Sunday, June 2, 2024

5 Ways Big Data Can Help Us

 by Lorena Salvado, San Marin High School 

Data storage centers


Big Data, or large data sets, is used to train machines and programs (using specific algorithms) to identify and solve problems in our world. Many machine learning algorithms focus on identifying images and recognizing early symptoms of illnesses, helping patients and doctors alike. Others predict voting patterns and economic trends, which is equally important. All of these algorithms are fueled by large data sets, which help train the machine (or AI) to make more accurate predictions and decisions.

Many researchers use these algorithms to predict needs in marginalized communities and help them. One is Abby Smith, who came to Marin Science Seminar and discussed the benefits of Big Data in her talk, “Data Science for Social Good”. These datasets also help many other organizations decide what help to deliver and keep track of the issues within these communities.

So, given that there are many opportunities to help using Big Data, here are 5 positive ways Big Data is being used to help those who need it.


1. Tracking and Addressing the Homelessness Crisis

One of the main projects Smith worked on was a predictive and preventative program for individuals at near-term risk of homelessness in Allegheny County. Her work involved taking data on numerous individuals, along with previous homelessness data and patterns, and applying it to current situations. From these programs she could understand who was at a higher risk of homelessness in the near future, and from there the Allegheny County Department of Human Services could intervene and further help these people. This was a very impactful project, as it helped many people who were at risk of homelessness.
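To make the idea of "learning from past patterns" more concrete, here is a minimal Python sketch of a risk score. It is not Smith's or Allegheny County's actual method: the two features (missed rent payments and prior shelter stays), the records in `HISTORY`, and the names `train` and `risk_score` are all made up for illustration, and the model is a plain logistic regression chosen only because it is simple.

```python
import math

# Hypothetical, made-up past records (NOT real data):
# (missed_rent_payments, prior_shelter_stays) -> 1 if the person later
# experienced homelessness, 0 otherwise.
HISTORY = [
    ((0, 0), 0),
    ((1, 0), 0),
    ((0, 1), 0),
    ((0, 0), 0),
    ((2, 1), 1),
    ((3, 0), 1),
    ((1, 2), 1),
    ((3, 2), 1),
]


def sigmoid(z):
    """Squash any number into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(records, epochs=2000, learning_rate=0.1):
    """Fit a tiny logistic regression with stochastic gradient descent."""
    weights = [0.0] * len(records[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in records:
            z = bias + sum(w * x for w, x in zip(weights, features))
            error = sigmoid(z) - label
            # Nudge each weight in the direction that reduces the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def risk_score(model, features):
    """Return a score between 0 and 1; higher means higher estimated risk."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


model = train(HISTORY)

# Hypothetical current cases, ranked from highest to lowest estimated risk.
current_cases = {"case_a": (0, 0), "case_b": (3, 2), "case_c": (1, 1)}
ranked = sorted(current_cases,
                key=lambda name: risk_score(model, current_cases[name]),
                reverse=True)
print(ranked)
```

The ranked list is the useful part: a human services team could use something like it to decide whom to contact first. A real program would need far more data, careful checks for bias, and people reviewing every decision.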

2. Local Data for Local People

The Native BioData Consortium (NBDC) is a database run by Native Americans focused on gathering information about Native Americans. Its data ranges from public health COVID-19 surveillance and tracking to the study and documentation of chronic diseases and the effects of medicine. This research and data help foster trust within these communities and incorporate more aspects of their lives into science (by including biological, ecological, and sociological factors alongside the data itself). It is also used to encourage and teach future Native American scientists and to improve their tribes and communities.

3. Language Preservation and Research 

Language preservation has become a big issue in recent decades. According to UNESCO estimates, 90% of the world’s languages could disappear within the next 80 years. This means that many small indigenous communities will lose their languages, due to assimilation or to being overtaken in importance and significance by outside cultures and languages. Languages are a very important part of culture, with many aspects of a culture being carried by its spoken or written language. To preserve these languages, many have launched large Data Science for Good projects, such as Google’s Woolaroo project. These projects collect large amounts of data about endangered languages: their syntax, their phonetic sounds, and their cultural importance. Documenting these endangered languages helps us preserve our cultures and use machine learning algorithms to teach the languages back to future generations, or simply to understand their patterns better.

Another example of language preservation can be seen in the work done for extinct languages. For example, many Native American languages were lost due to colonization and years of forced assimilation. Many of these languages did not have written forms that we can trace back historically. Many language-based algorithms are using the little oral data that exists on these languages to predict and “fill in” (phonetically, grammatically, etc.) what these languages might have sounded like. This helps many historians and Native communities, as they have better chances of recovering their languages and cultures.

4. Tracking and Preventing Avian Influenza Risk

During the 2020/2021 avian influenza outbreak, a team in Korea decided to build a large database containing cases of avian influenza as well as its early stages. This data was later used in many algorithms across the country to help identify and track cases of avian influenza, which helped mitigate the infections and the effects of the illness. Today, with a widespread outbreak of avian influenza going around, and with cases even spreading outside of the avian world (to cattle and now even humans!), large data sets and algorithms like these are needed. By having a lot of information on the patterns and spread of illnesses, we can better protect our own communities and lessen that spread.

5. Helping Businesses Target Their Consumers Better

This is by far one of the most widespread uses of Big Data in our modern world. Many businesses, from social media platforms to small businesses to large companies, seek to advertise and adapt to their consumer base, and they use some sort of data collection system to keep track of their consumers. Many businesses have a certain demographic in mind to sell their products to (for example, a business selling paints will want to market its products to painters or artists looking for paint). Collecting big data about individual users on platforms or from internet browsing helps these businesses target their advertisements more accurately.


From preventing and tracking homelessness, to helping businesses run smoothly and make more profit, to predicting and tracking global diseases, to preserving and reconstructing historical and lesser-spoken languages, big data and algorithms play a huge role in our world. More and more data is produced every day, and with more data, machine learning algorithms can become more accurate and more efficient. These crucial algorithms can help not only our society, in the domains of business, finance, and homelessness, but also the wider world we live in, by tracking avian flu cases, predicting natural disasters, or helping spaceflights run more smoothly. And of course, these are just some of the ways that large data sets can be used to help people and improve algorithms. A few more examples are covered in depth in Abby Smith’s talk at Marin Science Seminar, which can be seen here: "Data Science for Social Good"

More information on Abby Smith’s work: https://marinscienceseminar.com/data-science-for-social-good/, https://abbylsmith.me/, https://www.dssgfellowship.org/

Find upcoming Marin Science Seminar talks on the official Marin Science Seminar website: https://marinscienceseminar.com/









About Us

Marin Science Seminar is a one-hour science lecture/presentation with a question and answer period, open to all interested local teenagers, educators, and community members. Seminar sessions are held on 12 Wednesday evenings during the school year, from 7:30 to 8:30 pm in the Innovation Hub at Terra Linda High School, 320 Nova Albion Way, San Rafael. Seminar speakers are scientists, mathematicians, engineers, physicians, technologists, and computer programmers. Each topic falls within the speaker’s area of expertise and is geared to interested high school students.