What Is Data Science?

A New, Emerging Discipline

90% of the world's data has been produced in the last two years.

Data Explosion

Data is becoming increasingly accessible and ubiquitous. We are now digitizing analog content created over centuries and collecting countless new types of data from web logs, mobile devices, sensors, instruments, and transactions. IBM estimates that 90 percent of the data in the world today has been created in the last two years.

At the same time, new technologies are emerging to organize and make sense of this avalanche of data. We can now identify patterns and regularities in data of all kinds that allow us to advance scholarship, improve the human condition, and create commercial and social value. The rise of “big data” has the potential to deepen our understanding of phenomena ranging from physical and biological systems to human social and economic behavior.*

A Recognized Shift

Virtually every sector of the economy now has access to more data than would have been imaginable even a decade ago. Businesses today are accumulating new data at a rate that exceeds their capacity to extract value from it. The question facing every organization that wants to attract a community is how to use data effectively — not just their own data, but all of the data that’s available and relevant.

“This hot new field promises to revolutionize industries from business to government, health care to academia.”

The New York Times

Our ability to derive social and economic value from the newly available data is limited by the lack of expertise. Working with this data requires distinctive new skills and tools. The corpuses are often too voluminous to fit on a single computer, to manipulate with traditional databases or statistical tools, or to represent using standard graphics software. The data is also more heterogeneous than the highly curated data of the past. Digitized text, audio, and visual content, like sensor and weblog data, is typically messy, incomplete, and unstructured; it is often of uncertain provenance and quality; and frequently must be combined with other data to be useful. Working with user-generated data sets also raises challenging issues of privacy, security, and ethics.

The field of data science is emerging at the intersection of the fields of social science and statistics, information and computer science, and design.


*There is no agreed-upon definition of “big data.” The tools of data science are as appropriate for gigabyte-scale datasets as they are for petabyte-scale ones. “Big data” typically refers to data on the scale of terabytes (10^12 bytes) and petabytes (10^15 bytes). A petabyte is a million gigabytes.
