
How Can Big Data Save the World?

Source: 365体育官网 | Published: 2019-04-12 17:20

Our ability to collect data far outpaces our ability to fully utilize it—yet those data may hold the key to solving some of the biggest global challenges facing us today.


Take, for instance, the frequent outbreaks of waterborne illnesses as a consequence of war or natural disasters. The most recent example can be found in Yemen, where roughly 10,000 new suspected cases of cholera are reported each week—and history is riddled with similar stories. What if we could better understand the environmental factors that contributed to the disease, predict which communities are at higher risk, and put in place protective measures to stem the spread?



Answers to these questions and others like them could help us avert catastrophe.


We already collect data related to virtually everything, from birth and death rates to crop yields and traffic flows. IBM estimates that each day, 2.5 quintillion bytes of data are generated. To put that in perspective: that's the equivalent of all the data in the Library of Congress being produced more than 166,000 times per 24-hour period. Yet we don't really harness the power of all this information. It's time that changed—and thanks to recent advances in data analytics and computational services, we finally have the tools to do it.
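As a quick sanity check on that comparison, the two quoted figures imply a size for the Library of Congress collection. The sketch below only divides one number from the text by the other; the variable and function names are illustrative, and decimal terabytes (10^12 bytes) are an assumption about units.

```python
# Figures quoted in the text above.
BYTES_PER_DAY = 2.5e18             # "2.5 quintillion bytes" generated per day
LIBRARY_MULTIPLES_PER_DAY = 166_000  # "more than 166,000 times per 24-hour period"


def implied_library_size_bytes(bytes_per_day, multiples_per_day):
    """Size of one 'Library of Congress' implied by the two quoted figures."""
    return bytes_per_day / multiples_per_day


size_bytes = implied_library_size_bytes(BYTES_PER_DAY, LIBRARY_MULTIPLES_PER_DAY)
size_tb = size_bytes / 1e12  # assumption: decimal terabytes

print(f"Implied collection size: about {size_tb:.1f} TB")
```

Because the text says "more than 166,000 times", the result is best read as an upper bound of roughly 15 TB on the collection size the comparison assumes.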


As a data scientist for Los Alamos National Laboratory, I study data from wide-ranging public sources to identify patterns, in hopes of predicting trends that could threaten global security. Multiple data streams are critical because the ground-truth data (such as surveys) that we collect are often delayed, biased, sparse, incorrect or, sometimes, nonexistent.


For example, knowing mosquito incidence in communities would help us predict the risk of mosquito-transmitted diseases such as dengue, the leading cause of illness and death in the tropics. However, mosquito data at a global (and even national) scale are not available.


To address this gap, we're using other sources such as satellite imagery, climate data and demographic information to estimate dengue risk. Specifically, we had success predicting the spread of dengue in Brazil at the regional, state and municipality level using these data streams as well as clinical surveillance data and Google search queries that used terms related to the disease. While our predictions aren't perfect, they show promise. Our goal is to combine information from each data stream to further refine our models and improve their predictive power.


Similarly, to forecast the flu season, we have found that Wikipedia and Google searches can complement clinical data. Because more people search the internet for flu symptoms as those symptoms set in, search activity lets us predict a spike in cases even where clinical data lag.
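The idea that search activity can run ahead of clinical reports can be illustrated with a lagged correlation. The sketch below is not the author's model: it uses synthetic weekly series in which cases are constructed to trail searches by two weeks, and the helper names (`pearson`, `best_lead`, `bump`) are hypothetical. It only shows how one might look for the lead time at which searches best track later case counts.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_lead(searches, cases, max_lead):
    """Return (lead, r): the shift in weeks at which searches[t]
    correlates most strongly with cases[t + lead]."""
    best = None
    for lead in range(max_lead + 1):
        xs = searches[: len(searches) - lead]  # searches in week t
        ys = cases[lead:]                      # cases in week t + lead
        r = pearson(xs, ys)
        if best is None or r > best[1]:
            best = (lead, r)
    return best


def bump(t, peak, width=3.0):
    """Bell-shaped seasonal curve centred on week `peak` (synthetic)."""
    return math.exp(-((t - peak) / width) ** 2)


# Synthetic data (assumption): searches peak in week 12, cases in week 14.
weeks = range(30)
searches = [100 * bump(t, 12) for t in weeks]
cases = [40 * bump(t, 14) for t in weeks]

lead, r = best_lead(searches, cases, max_lead=5)
print(f"Searches lead cases by {lead} week(s), r = {r:.3f}")
```

On real data the relationship would be far noisier, and a lead time estimated this way would need validation against held-out seasons before being used for forecasting.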
