Analyzing scRNA-Seq data with XGBoost

Posted on Wed 10 April 2024 in R • Tagged with Bioinformatics, gene expression

Introduction

Breast cancer is one of the leading causes of morbidity and mortality worldwide. In 2022, 2.3 million women were diagnosed with breast cancer and about 670,000 died from the disease, according to the World Health Organization.

Traditional breast cancer treatment with chemotherapy may be complicated …


Continue reading

Parallelization with R

Posted on Mon 31 July 2023 in R • Tagged with Bioinformatics, gene expression, edgeR, furrr

Introduction

Some computations can be carried out in parallel. Certain large tasks can be divided into independent subtasks that are solved at the same time, rather than waiting for each one to finish sequentially.
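As a rough illustration of that idea, here is a minimal Python sketch using the standard library's `concurrent.futures`. It is not the approach taken in the post itself, which works in R; the function names (`slow_square`, `run_sequential`, `run_parallel`) and the simulated delay are assumptions made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_square(x):
    """Hypothetical stand-in for an independent task; the sleep simulates waiting."""
    time.sleep(0.1)
    return x * x


def run_sequential(values):
    # Each task starts only after the previous one has finished.
    return [slow_square(v) for v in values]


def run_parallel(values, workers=4):
    # The tasks do not depend on each other, so a pool of workers can
    # handle several at once. pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(slow_square, values))


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(run_sequential(data))
    print(run_parallel(data))
```

Both functions return the same results; only the waiting time differs. Threads suit tasks that mostly wait, as simulated here. For CPU-bound work in Python, swapping in `ProcessPoolExecutor` is the usual alternative.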

I find the native R parallel functions such as mclapply(), or those …


Continue reading

Machine Learning with Python: Supervised Classification of TCGA Prostate Cancer Data (Part 1 - Making Features Datasets)

Posted on Thu 05 November 2020 in Python • Tagged with Bioinformatics, gene expression, machine learning, supervised classification

Introduction

In a previous post, I showed how to retrieve The Cancer Genome Atlas (TCGA) data from the Cancer Genomics Cloud (CGC) platform. I downloaded gene expression quantification data, created a relational database with PostgreSQL, and created a dataset uniting the raw quantification data for 675 differentially expressed genes identified …


Continue reading

Machine Learning with Python: Supervised Classification of TCGA Prostate Cancer Data (Part 2 - Making a Model)

Posted on Thu 05 November 2020 in Python • Tagged with Bioinformatics, gene expression, machine learning, supervised classification

Introduction

In a previous post, I showed how to retrieve The Cancer Genome Atlas (TCGA) data from the Cancer Genomics Cloud (CGC) platform. I downloaded gene expression quantification data, created a relational database with PostgreSQL, and created a dataset uniting the raw quantification data for 675 differentially expressed genes identified …


Continue reading

Differential Expression Analysis with edgeR in R

Posted on Mon 26 October 2020 in R • Tagged with Bioinformatics, gene expression, edgeR

Introduction

In my previous post, I demonstrated how to organize the CGC prostate cancer data into a format suited for differential expression analysis (DEA).

Nowadays, DEA usually arises from high-throughput sequencing of a collection (library) of RNA molecules expressed by single cells or tissue given their conditions upon collection and …


Continue reading

Data manipulation with R

Posted on Mon 19 October 2020 in R • Tagged with Bioinformatics, gene expression, SQL, PostgreSQL

Introduction

In my previous post, I demonstrated how to obtain a prostate cancer dataset with genomic information in the form of gene expression quantification, and how to create a local PostgreSQL database to hold the data.

Here, I will use R to connect to the PostgreSQL database, retrieve and then prepare the …


Continue reading