Summary: Biosystems Data Analysis


Read the summary and the most important questions on Biosystems Data Analysis

  • A: Data properties, preprocessing, p-value

  • OMICS is not high-throughput data. What is it, then?

    • High throughput = quantification of a single component in a large number of samples (in a short time)
      • i.e. determine one concentration in many samples
    • Multiplex technologies = quantification of a large number of (related) components in a single sample
      • --> OMICS
  • Give an overview of the RNA-seq procedure.

    • Stop all activity in the sample (quenching)
    • Isolate mRNA
    • Reverse transcription (RNA --> DNA)
    • Optional amplification by PCR
    • Library construction (attaching sequence tags/adaptors)
    • Sequencing
  • Which points should you consider in regard to quenching?

    • RNAs have short half-lives in cells
    • RNases have to be inactivated
    • The stress of handling can induce gene expression
    • Breaking cells open (e.g. bacteria) can be hard
    • Obtaining the sample can be time-consuming
  • Which points should you consider for RNA isolation?

    • Most RNA is ribosomal RNA
    • Eukaryotic mRNA can be enriched using poly-A hybridization
  • Name the two types of variation.

    • Biological variation
      • = variation of interest
      • Variation between similar samples (individuals)
    • Technical variation
      • = variation to filter out
  • Name sources of technical variation (e.g. in RNA-seq).

    • Sample preparation (media, temperature...)
    • Sample isolation (handling, speed of sequencing)
    • Differences in mRNA quality
    • cDNA synthesis
    • Amount of cDNA added
    • Sequence bias
    • Random measurement error (you cannot get rid of it unless you repeat it many times and take the average)
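The last point, that random error only shrinks when you average repeated measurements, can be illustrated with a small simulation. This is a minimal sketch, not part of the original summary: the function name `spread_of_mean`, the true value, the noise level and the Gaussian noise model are all assumptions chosen for illustration.

```python
import random
import statistics

def spread_of_mean(n_replicates, true_value=100.0, noise_sd=10.0,
                   n_trials=2000, seed=0):
    """Standard deviation of the replicate mean over many simulated experiments.

    Assumes purely random, Gaussian measurement error (hypothetical model).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        # One experiment: measure the same quantity n_replicates times.
        replicates = [rng.gauss(true_value, noise_sd) for _ in range(n_replicates)]
        means.append(statistics.fmean(replicates))
    return statistics.stdev(means)

# The spread of the averaged result shrinks as replicates are added.
for n in (1, 4, 16, 64):
    print(n, round(spread_of_mean(n), 2))
```

Under these assumptions the spread falls roughly as noise_sd / sqrt(n), so quadrupling the number of replicates only halves the random error. Averaging does not remove systematic bias.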
  • Where does bias in RNA-seq come from?

    • Variation in amount of isolated mRNA
    • Variation in quality of isolated mRNA
    • Variation in quenching efficiency
    • Variation in cDNA synthesis efficiency
    • Variation in sequencing efficiency (number of sequences read)

    Or, due to interesting biological variation in amount of mRNA!
  • Give 3 subtle bias effects in RNA-seq.

    • Fragment length (size selection)
    • Position (degradation)
    • Sequence bias (high GC --> lower counts)
  • Normalization requires an assumption about where the observed variation in sequencing counts comes from. Give the two hypotheses and the procedure that follows from each.

    • H1: Approximately equal concentration of mRNA in each sample
      • Implies that variations in total counts per sample are due to technical reasons
      • Solution: divide the sequence count for each gene by the total sequence count of the sample (and scale to one million reads)
      • Result: RPM: Reads Per Million reads
    • H2: Approximately constant number of sequences per kilobase of mRNA
      • Implies that variations in counts between genes are partly due to gene length
      • Solution: divide the sequence count for each gene by the total sequence count and by the length of the gene in kilobases
      • Result: RPKM: Reads Per Kilobase per Million reads
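Written as formulas, the two procedures above look as follows. The symbols are assumptions introduced here for illustration: c_g is the sequence count for gene g, N is the total sequence count of the sample, and L_g is the length of gene g in bases.

```latex
\[
\mathrm{RPM}_g = \frac{c_g}{N} \times 10^{6},
\qquad
\mathrm{RPKM}_g = \frac{\mathrm{RPM}_g}{L_g / 1000}
               = \frac{c_g \times 10^{9}}{N \, L_g}
\]
```

The factor 10^6 is what makes the value "per million" reads, and dividing L_g by 1000 converts the gene length from bases to kilobases.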
  • How can we use RPM and RPKM?

    • RPM: allows comparison of the same gene between samples
    • RPKM: allows comparison of the same gene between samples and, on top of that, between different genes
      • but the underlying hypothesis is debatable (subtle bias)
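A small worked example may make the difference concrete. It is a sketch with made-up read counts, sequencing depths and gene lengths. The helper names `rpm` and `rpkm` are hypothetical and do not come from any particular library.

```python
def rpm(count, total_reads):
    """Reads per million: count scaled to a library of one million reads."""
    return count * 1_000_000 / total_reads

def rpkm(count, total_reads, length_bp):
    """Reads per kilobase per million: RPM divided by gene length in kb."""
    return rpm(count, total_reads) / (length_bp / 1000)

# Same gene in two samples sequenced at different depths (hypothetical numbers):
# the raw counts differ twofold, the RPM values are equal.
print(rpm(200, 2_000_000), rpm(400, 4_000_000))   # 100.0 100.0

# Two genes in one sample: the second gene is four times longer (4 kb vs 1 kb)
# and collects four times as many reads.
depth = 2_000_000
print(rpm(200, depth), rpm(800, depth))                # 100.0 400.0
print(rpkm(200, depth, 1000), rpkm(800, depth, 4000))  # 100.0 100.0
```

With RPM alone the longer gene looks four times more expressed. RPKM removes that length effect, but only to the extent that the "constant reads per kilobase" hypothesis holds.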
