data-150

Taxonomically, in academia, data science is a subfield of computer science. Personally, I see data science as a tool, or a set of tools and methods, that a researcher or practitioner applies to data in a particular domain. This combination is, in essence, their "craft". I, for example, apply data science to geography in the context of international development. This means I use tools such as machine learning on data such as satellite imagery to help solve problems in the international development community. Data science can be surprisingly hard to define given its broad applications and its newness. I don't necessarily think there's one correct definition and, whatever it is, the field will continue to evolve with new technologies, new industries, and new issues. There's also a question of skill ranges. Should the title of "data scientist" be reserved for those with doctoral degrees, in the same way it is in physics, for example, with anyone below that level being basically something else (like an analyst)?

Think about an area of study or field you care about (outside of your formal assignment topic), maybe the major you intend to choose. Based on what you have read and know about data science and its related areas (data analysis, stats, and ML), briefly (1-2 paragraphs) write about how you think these disciplines could be used in your field (it's okay to think ambitiously). If such methods are already being used, what are they and to what extent (if you know)? If you are a prospective data science major, what ideas do you have for how data science could be used in ways you believe it currently is not? You have until 10:10.

In chemistry, when we perform an experiment, we tend to repeat it several times to get a relatively precise result. The three goals of data science are closely related to this process. First, we have to analyze the large amount of data from our experiments and find the differences and similarities among the runs. If one group of data is totally different from the others, we can study the reason and try to improve the experiment. Slight differences in the results are also worth studying, because they can reflect big differences in the underlying theories. Second, using statistical methods, we can change some of the conditions and observe how the final results differ. We can find patterns and then the connections between the conditions and the results. By repeating this process, we can identify causal relationships. Third, based on the accumulated data, we may find rules and use them to make predictions. We can build chemical models that help us solve problems like these much faster, which saves scientists time that they can spend on further study.
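To make these three steps concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the titration volumes, temperatures, and yields are invented numbers, and the helper names `flag_outliers` and `fit_line` are hypothetical. It flags a repeated trial that differs sharply from the others (using the modified z-score, one common choice among several), fits a straight line relating a condition to a result, and uses that line to predict a new case.

```python
from statistics import mean, median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that one bad trial barely moves.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to compare against
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


# Step 1 (describe): hypothetical repeated titration volumes in mL.
# The fourth trial is deliberately far from the rest.
trials = [24.9, 25.1, 25.0, 27.8, 25.0]
outliers = flag_outliers(trials)

# Step 2 (relate conditions to results): hypothetical temperature (deg C)
# versus reaction yield (%), one measurement per temperature.
temps = [20, 30, 40, 50]
yields = [41.0, 52.0, 63.0, 74.0]
slope, intercept = fit_line(temps, yields)

# Step 3 (predict): estimated yield at a temperature that was not measured.
predicted = slope * 60 + intercept

print("outlier trial indices:", outliers)
print("fitted slope and intercept:", slope, intercept)
print("predicted yield at 60 deg C:", predicted)
```

A fitted slope only describes how the result moves with the condition in the data at hand. It is the deliberate, controlled change of one condition at a time that supports reading the relationship as causal, and a prediction far outside the measured range should be treated with caution.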