The Social Genome Project

1. What is the main idea?

In scope and intent, the Social Genome Project is a data-rich knowledge base that lets researchers, professionals, business leaders, and government officials capture interesting entities and relationships, as detailed below:

  • Human activities leave digital traces in various government data systems, which can collectively capture our social genome, the footprints of our society.
  • Data become most powerful when integrated. Fragmented efforts to make government data available to the public are neither effective nor efficient. Furthermore, current privacy protection mechanisms are ineffective and, at the same time, make the data less useful.
  • This project will help make government data publicly available for socially beneficial research such as ‘What is the impact of education in low-resource schools on rates of subsequent arrest/incarceration?’
  • Like the human genome, the social genome has much buried in its massive, almost chaotic data.
  • If properly analyzed and interpreted, this social genome could offer crucial insights into many of the most challenging problems facing our society (e.g. affordable and accessible quality healthcare, economics, education, employment, and welfare).
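As a rough illustration of why integration matters, the sketch below links two separately held record sets on a shared identifier to approach the example question about low-resource schools and later arrests. Everything in it is an assumption made for illustration: the field names (`person_id`, `school_resources`), the function `arrest_rate_by_group`, and the toy records, which are invented and carry no empirical meaning. Real government systems rarely share a clean common key, so actual linkage would be far harder than this.

```python
from collections import defaultdict

# Toy records invented purely for illustration; they are not real data.
education = [
    {"person_id": 1, "school_resources": "low"},
    {"person_id": 2, "school_resources": "low"},
    {"person_id": 3, "school_resources": "high"},
    {"person_id": 4, "school_resources": "high"},
]
arrests = [{"person_id": 2}, {"person_id": 2}, {"person_id": 3}]


def arrest_rate_by_group(education, arrests):
    """Link education rows to arrest rows by person_id and return,
    per school-resource group, the share of people with any arrest."""
    arrested_ids = {row["person_id"] for row in arrests}
    totals = defaultdict(int)
    with_arrest = defaultdict(int)
    for row in education:
        group = row["school_resources"]
        totals[group] += 1
        if row["person_id"] in arrested_ids:
            with_arrest[group] += 1
    return {group: with_arrest[group] / totals[group] for group in totals}


rates = arrest_rate_by_group(education, arrests)
```

Neither record set can answer the question on its own; only the linked view can, which is the sense in which fragmented data releases fall short.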

2. Why will it work?

Companies already monitor our activities to maximize profit. It’s time to use data mining technology for the worthy goal of understanding and solving the problems of society. A transparent data system can provide a rich source of information for population informatics.

The Social Genome Project will facilitate use of such data by building an infrastructure of the tools and techniques required for such research, with privacy protection designed into the infrastructure itself. The project envisions a totally transparent glass building that holds all the data and the tools required to use them, where the public can monitor all activities at all times.

3. What are the main challenges?

The main building blocks of a Social Genome Data Center are:

  1. A data infrastructure (i.e. computer system) that can provide secure and appropriate access to data
  2. Tools (e.g. software, data, documentation) to facilitate accurate and ethical use of data
  3. Oversight mechanisms (e.g. an institutional review board, IRB) ensuring that the data are used only for societal benefit
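To make the interplay of these building blocks concrete, here is a minimal sketch of an access gate that releases data only for approved requests and records every request, allowed or not, in a log that could be opened to the public. The class `DataCenter`, its methods, and the log fields are hypothetical names chosen for illustration; the project does not specify a design, and a real system would also need authentication, encryption, and disclosure controls.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataCenter:
    """Hypothetical toy model of the building blocks listed above."""
    datasets: dict                                  # data infrastructure: name -> records
    approved: set = field(default_factory=set)      # oversight: approved (user, dataset) pairs
    audit_log: list = field(default_factory=list)   # every request, open to public view

    def approve(self, user, dataset):
        """Record an oversight decision (e.g. by an IRB) granting access."""
        self.approved.add((user, dataset))

    def request(self, user, dataset, purpose):
        """Return the dataset if access was approved; log the request either way."""
        allowed = (user, dataset) in self.approved and dataset in self.datasets
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset,
            "purpose": purpose,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{user} is not approved for {dataset}")
        return self.datasets[dataset]
```

Logging denied requests alongside granted ones is a deliberate choice in this sketch: it is what would let outside observers see attempted as well as actual use of the data.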

4. How would you fund/sustain the project?

Government agencies are under increasing pressure to be more transparent and to use data in everyday decision making, but they have limited expertise to do so.

By building a common data library accessible to approved people, agencies can more readily collaborate with experts and researchers to turn data into policy and action. Those who become experts in such data through these government projects can then use the social genome data to pursue larger grants (e.g. from NSF or NIH) to answer fundamental questions about our society. The project will be supported by contracts with government agencies for direct policy and evaluation work, and by grant funding for investigating fundamental questions about the most challenging problems facing our society: healthcare, education, employment, welfare, economics, and the environment.

5. What is it used for?

The social genome data is critical for the burgeoning fields of population informatics, public health informatics, and health informatics.

6. Are there technical papers on the topic?