Category Archives: Science

Retrieval-based Deep Learning with TensorFlow v1.0+ and Python

Hi there. In this post I will cover code from a GitHub repo that I forked (detailed in this post) that trains a machine learning model on IRC chat logs (the Ubuntu Dialog Corpus) to select the correct response out of a set of potential responses, given a context. The code was created last year with…

Cleaning Transcript Data with Python

Performing an analysis of text data or using text data to train machine learning models oftentimes requires a lot of data. Usually people look to Wikipedia for large amounts of text data, but occasionally scholars will make use of less traditional sources of data, like movie reviews for performing sentiment analysis on sentences or Ubuntu IRC chat…

6 Fascinating Distributed Computing Projects

In this post, I’m going to cover some scientific distributed computing projects coordinated through the BOINC and @home distributed networks. For an introduction to what distributed computing is, read this post and maybe the Wikipedia page. Essentially, the BOINC software sends your computer work units to complete, which are sent back to headquarters and combined…

Turn Your Computer Into a Science Lab

After having learned a little bit about Folding@Home (F@H) over the years, I found myself ready and willing to join a team and try it out. Folding@Home is a non-profit distributed computing project operated by the Pande lab at Stanford University. Essentially, the software simulates potential ways a protein can be folded, and once complete, sends the “work unit” back…