CSC483 HW05
CSC 483: Homework 5, Due Feb 10, 2026, at 1 pm
1. Begin your project notebook (2 pts + 2 pts in PR)
- Make a copy of the project guidelines:
https://colab.research.google.com/drive/1xqpoZnPbJjISUX3gsE59iw5CO-XQ6nNc?usp=sharing
Read through the guidelines. You can then delete the markdown cells containing instructions, since they are also available on the course website.
Load whichever dataset and metadata from first_data or second_data you are most interested in. Be careful of data sizes!
Add an exploratory plot, such as a PCA or t-SNE, with a level of finesse similar to what we plotted in class (it does not need to be perfect).
Add a markdown cell with some notes doing your best to describe why and how we are making PCA and t-SNE plots.
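As a reminder of what a PCA plot is built from, here is a minimal sketch of the "how". It is an illustration under stated assumptions, not the in-class code: the matrix `X` is made-up data (20 samples by 50 genes), the helper name `pca_2d` is hypothetical, and the projection uses a plain NumPy SVD rather than any particular library's PCA class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an expression matrix: 20 samples x 50 genes.
# The last 10 samples are shifted so there is group structure to find.
X = rng.normal(size=(20, 50))
X[10:] += 2.0


def pca_2d(X):
    """Project the rows of X onto the first two principal components."""
    Xc = X - X.mean(axis=0)                    # center each gene (column)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    pcs = Xc @ Vt[:2].T                        # sample coordinates on PC1, PC2
    var_ratio = (S ** 2) / np.sum(S ** 2)      # fraction of variance per PC
    return pcs, var_ratio[:2]


pcs, var_ratio = pca_2d(X)
print(pcs.shape)  # (20, 2): one point per sample
```

The two columns of `pcs` are what get drawn on the x and y axes of the scatter plot, and `var_ratio` says how much of the total variance each axis captures. That is the "why" in brief: thousands of gene dimensions are compressed into two axes that keep as much variation as possible, so groups of similar samples can be seen by eye.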
Note: This may seem repetitive of what we did in class; feel free to copy and paste your own work from notebook to notebook. Starting a new notebook gives you the chance to create a more organized analysis that combines elements of first_data and second_data without some of the extra bits I added as examples. Most notably, you only have to load the datasets you are actually examining, rather than the multiple metadata files and multiple RNA-seq files I’ve loaded in first_data and second_data. You do not need to complete all of the following, but for reference, recall the in-class instructions from the second_data notebook:
Decision points
- use smaller rnaseq datasets (filtered 6k x 16k) OR use gene P/A
- reduce dim of genes OR dim of samples (transpose or not to transpose)
- connect to metadata of samples OR of genes
- reducing by genes means plotting samples, and vice versa
- merge rnaseq with sample metadata, use SRX numbers
- merge gene P/A with annotation, use first_name_comp
- merge rnaseq with gene annotation, use first_name_comp
- Choose a color scheme
- Does DR technique separate out groups, as seen by color?
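The transpose and merge decisions above can be sketched as follows. This is a toy example under assumptions: the tables, the `SRX` and `tissue` column names, and the accession values are all made up for illustration, and your real metadata columns will differ.

```python
import pandas as pd

# Hypothetical genes x samples table; column names are SRX accessions.
rnaseq = pd.DataFrame(
    {
        "SRX001": [5.0, 0.0, 2.0],
        "SRX002": [4.0, 1.0, 0.0],
        "SRX003": [0.0, 7.0, 3.0],
    },
    index=["geneA", "geneB", "geneC"],
)

# Hypothetical sample metadata keyed by SRX accession.
# SRX003 is deliberately missing and SRX004 has no expression data.
sample_meta = pd.DataFrame(
    {"SRX": ["SRX001", "SRX002", "SRX004"], "tissue": ["leaf", "root", "leaf"]}
)

# Transpose so each row is a sample; these rows are what get plotted
# when the gene dimension is the one being reduced.
by_sample = rnaseq.T
by_sample.index.name = "SRX"

# A left merge keeps every sample row; indicator=True adds a _merge
# column showing which samples found a metadata match.
merged = by_sample.reset_index().merge(
    sample_meta, on="SRX", how="left", indicator=True
)
print(merged[["SRX", "tissue", "_merge"]])
```

Checking the `_merge` column before plotting is a cheap way to catch samples that silently lost their metadata. The same pattern would apply to merging on first_name_comp for gene annotation, with genes as rows instead of samples.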
Mini-steps
- color all points on PCA NOT blue
- color half the points one color, half another
- color the first sample a unique color
- load metadata (as we did in first_data)
- turn pca from numpy array to pd df
- figure out rownames for PCA df, based on column names df4
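A few of the mini-steps above can be sketched together. The data here is hypothetical: `df4` is a placeholder genes x samples table with invented SRX column names, and `pca_array` stands in for the (n_samples, 2) array a PCA would return for those samples. The color names are arbitrary choices.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: df4 is genes x samples, and pca_array plays
# the role of PCA output with one row per sample (column of df4).
df4 = pd.DataFrame(
    np.zeros((3, 6)), columns=[f"SRX00{i}" for i in range(1, 7)]
)
pca_array = np.arange(12, dtype=float).reshape(6, 2)

# NumPy array -> DataFrame, with row names taken from df4's column names.
pca_df = pd.DataFrame(pca_array, index=df4.columns, columns=["PC1", "PC2"])

# Half the points one color, half another, then a unique color
# for the first sample. None of them are blue.
n = len(pca_df)
colors = ["tab:orange"] * (n // 2) + ["tab:green"] * (n - n // 2)
colors[0] = "red"
pca_df["color"] = colors

print(pca_df)
```

With matplotlib, the `color` column can be passed as the `c` argument of `plt.scatter(pca_df["PC1"], pca_df["PC2"], c=pca_df["color"])`. Once sample metadata is merged in, the same column can be built from a metadata field instead of from row position.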
2. Update Check-in Notebook (1 pt)
Move your reflections from HW4 into your check-in notebook.
Review your check-in notebook and scores for all previous HW assignments.
Include a Markdown chunk that attests to my notes or requests a correction/update.