Calendar
During the fall 2018 semester, the Computational Social Science (CSS) and Computational Sciences and Informatics (CSI) programs have merged their seminar/colloquium series, in which students, faculty, and guest speakers present their latest research. These seminars are free and open to the public. The series takes place on Fridays from 3:00 to 4:30 p.m. in the Center for Social Complexity Suite, located on the third floor of Research Hall.
If you would like to join the seminar mailing list, please email Karen Underwood.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Robert Axtell, PhD
Computational Social Science Program, Department of Computational and Data Sciences,
College of Science
Department of Economics, College of Humanities and Social Sciences
Krasnow Institute for Advanced Study
George Mason University
Co-Director
Computational Public Policy Lab
Krasnow Institute for Advanced Study and Schar School of Policy and Government
External Professor, Santa Fe Institute (santafe.edu)
External Faculty, Northwestern Institute on Complex Systems (nico.northwestern.edu)
Scientific Advisory Council, Waterloo Institute for Complexity and Innovation (wici.ca)
Ages and Lifetimes of U.S. Firms: Why Businesses Should NOT be Treated Like People
Monday, October 30, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Over the last 150 years American corporations have acquired many rights associated with individual citizens, such as free speech, the ability to make campaign contributions, and so on. In this talk I will quantify the age-related demographic properties of U.S. business firms and argue that the peculiar nature of firm aging suggests that businesses are very much unlike individual people. Specifically, using data on all 6 million U.S. firms having employees, I document that firm ages are discrete Weibull-distributed, while firm lifetimes follow a closely related distribution. Further, the hazard rates associated with firm survival decline monotonically according to a power law. From this the expected remaining lifetime can be computed, and it will be demonstrated that this INCREASES as firms age. Specifically, while a new firm in the U.S. can expect to live for about 15 years, a firm that has survived 50 years can expect to live for 30 more. Finally, conditioning on firm size leads to even more extreme results: increasing firm size by a decade (a factor of ten) cuts the hazard rate in half. In sum, these results suggest that firm aging is very different from biological aging and makes analogies between firms and people both quantitatively inaccurate and qualitatively wrong-headed. Technically, this talk will focus on the application of conventional demographic techniques to economic and financial data, including failure/survival analysis with censored data.
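The quantities in the abstract all follow from a survival curve. As a rough illustration only (hypothetical Weibull parameters, not the estimates from the talk), the following Python sketch shows how a declining hazard rate and a growing expected remaining lifetime arise when the Weibull shape parameter is below one:

import numpy as np
from scipy.integrate import quad

# Hypothetical Weibull parameters; shape k < 1 gives a monotonically declining hazard.
# These values are illustrative only, not the estimates reported in the talk.
k, lam = 0.6, 12.0

def survival(t):
    return np.exp(-(t / lam) ** k)

def hazard(t):
    return (k / lam) * (t / lam) ** (k - 1)

def expected_remaining_life(age):
    # E[T - age | T > age] = (1 / S(age)) * integral of S(t) from age to infinity
    tail, _ = quad(survival, age, np.inf)
    return tail / survival(age)

for age in (1, 10, 25, 50):
    print(f"age {age:>2}: hazard = {hazard(age):.3f}/yr, "
          f"expected remaining life = {expected_remaining_life(age):.1f} yr")

With shape below one the hazard falls with age, so the longer a firm has already survived, the longer it can expect to keep operating, which is the qualitative pattern described in the abstract.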
Rob Axtell earned an interdisciplinary Ph.D. at Carnegie Mellon University, where he studied computing, social science, and public policy. His teaching and research involve computational and mathematical modeling of social and economic processes. Specifically, he works at the intersection of multi-agent systems (computer science) and the social sciences, building so-called agent-based models for a variety of market and non-market phenomena.
His research has been published in leading scientific journals, including Science and the Proceedings of the National Academy of Sciences, USA, and reprised in Nature, and has appeared in top disciplinary journals (e.g., American Economic Review, Computational and Mathematical Organization Theory, Economic Journal), in general interest journals (e.g., PLOS ONE), and in specialty journals (e.g., Journal of Regulatory Economics, Technological Forecasting and Social Change). His research has been supported by American philanthropies (e.g., the John D. and Catherine T. MacArthur Foundation, the Institute for New Economic Thinking) and government organizations (e.g., the National Science Foundation, Department of Defense, Small Business Administration, Office of Naval Research, and Environmental Protection Agency). Stories about his research have appeared in major magazines (e.g., the Economist, Atlantic Monthly, Scientific American, New Yorker, Discover, Wired, New Scientist, Technology Review, Forbes, Harvard Business Review, Science News, Chronicle of Higher Education, Byte, Le Temps Stratégique) and newspapers (e.g., the Wall Street Journal, Washington Post, Los Angeles Times, Boston Globe, Detroit Free Press, Financial Times). He is co-author, with J. M. Epstein, of Growing Artificial Societies: Social Science from the Bottom Up (MIT Press), widely cited as an example of how to apply modern computing to the analysis of social and economic phenomena.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Shane Frasier, Ph.D.
Department of Homeland Security
Data Science and Cybersecurity at the Department of Homeland Security
Monday, TBA, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Among its many responsibilities, the Department of Homeland Security works to improve the security of the computer networks of the federal government and our nation’s critical infrastructure. This will be a discussion of some of the ways in which that is done, and some of the ways in which data science can contribute to that goal.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Jason M. Kinser, D.Sc.
Chair, Department of Computational and Data Sciences
George Mason University
Image Operators – A World Premiere
Monday, November 6, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: The onslaught of digital detectors has created the ability to capture massive amounts of image data. Analysis techniques have been maturing for decades, but this new flood of image data will tax the foundations of information dissemination. Published descriptions of image processes often consume much more real estate than do the scripts required to execute them. Furthermore, many published descriptions are imprecise. This talk will preview a new mathematical language dedicated solely to the fields of image processing and analysis. The language is paired with implementations in Python and Matlab, so there is a one-to-one correspondence between the mathematical description and computer execution. The talk will present several examples and culminate with an interactive analysis of image processing protocols.
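As an informal illustration of the brevity argument (ordinary NumPy/SciPy code, not the operator notation being premiered in the talk), a protocol that takes several sentences to describe in prose, such as "smooth the image, compute the gradient magnitude, and threshold at the mean," reduces to a few lines of script:

import numpy as np
from scipy import ndimage

def segment(image, sigma=2.0):
    # "Smooth the image, compute the gradient magnitude, threshold at its mean."
    smoothed = ndimage.gaussian_filter(image.astype(float), sigma)
    grad = ndimage.gaussian_gradient_magnitude(smoothed, sigma=1.0)
    return grad > grad.mean()

# Synthetic test image: a bright square on a dark background.
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0
mask = segment(img)
print(mask.sum(), "pixels above the edge-strength threshold")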
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Benjamin J. Radford, PhD
Principal Data Scientist
Sotera Defense Solutions
Clustering Techniques for Unsupervised Machine Learning
Monday, November 13, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Cluster analysis represents a broad class of unsupervised algorithms that are applicable to a variety of data science problems. An overview of some clustering models is provided and example use cases for clustering are discussed. Multivariate Gaussian mixture models are then discussed in detail and estimation techniques are outlined. K-selection is also discussed in the context of Gaussian mixture models. The talk concludes with a short discussion about how clustering techniques might be used in the context of cybersecurity.
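For readers unfamiliar with the methods named in the abstract, here is a minimal, hedged sketch (synthetic data and scikit-learn, not code from the talk) of fitting Gaussian mixture models and selecting K by minimizing the Bayesian information criterion:

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data: three well-separated 2-D Gaussian clusters.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(200, 2))
               for c in ([0, 0], [5, 5], [0, 5])])

# K-selection: fit a mixture for each candidate K and keep the one with the lowest BIC.
bics = {}
for k in range(1, 7):
    gmm = GaussianMixture(n_components=k, covariance_type="full",
                          random_state=0).fit(X)
    bics[k] = gmm.bic(X)

best_k = min(bics, key=bics.get)
print("BIC by K:", {k: round(v, 1) for k, v in bics.items()})
print("selected K =", best_k)  # expected to be 3 for this synthetic data
labels = GaussianMixture(n_components=best_k, random_state=0).fit_predict(X)

Lower BIC balances goodness of fit against model complexity, which is one common automatic answer to the K-selection question raised in the abstract.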
Dr. Radford is a Principal Data Scientist at Sotera Defense Solutions where he works on data-driven cybersecurity research programs for the Department of Defense. He received his Ph.D. in political science from Duke University in 2016. His research interests include political methodology, security and political conflict, the political implications of cyberspace, and automated event data coding. Dr. Radford’s dissertation demonstrated the semi-automated population of dictionaries for event-coding in novel domains. He is also an avid guitarist.
There will be no Computational Research and Applications Seminar on Monday, November 20.
Happy Thanksgiving!
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Juan R. Cebral, Ph.D.
Bioengineering & Mechanical Engineering Departments
Volgenau School of Engineering
George Mason University
Hemodynamics of Cerebral Aneurysms: Helping Diagnosis and Treatment
Monday, November 27, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: We use image-based computational fluid dynamics to model the blood flow in human cerebral arteries and aneurysms with three specific goals: 1) identify hemodynamic conditions that predispose aneurysms to instability and rupture, and thus help with more precise selection of patients at high risk; 2) advance the understanding of the disease mechanisms and enable drug-based therapies targeting specific pathways of wall degeneration and weakening; and 3) evaluate devices and procedures to improve treatment planning and long-term outcomes. In this talk I will summarize some of our recent progress and results along these three lines of research and discuss future directions.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
James Glasbrenner, PhD
Assistant Professor
George Mason University
Reproducible Research & Best Practices for Computational Science
Monday, December 4, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Have you ever had one of the following thoughts while working on your research?
- I can’t remember where I put that data file.
- I knew what these variables meant when I wrote them last year.
- Did I accidentally delete that email with the final version of our research paper attached?
- Why does my collaborator’s program delete the last row and column of this array before entering the main loop?
If so, then you're not alone, because "most researchers are never taught the equivalent of basic lab skills for research computing" [1]. This situation persists even as the average scientific researcher devotes as much as 30% of their time to developing scientific software and 40% to using it [2]. Underdeveloped skills in programming, project organization, and documentation can lead to general frustration, productivity losses, an increased risk that a researcher won't be able to reproduce his or her own work, and even serious computational errors that invalidate a study's general conclusions [3]. At the same time, the number of scientific research groups integrating data science topics and methods into their programs is growing at a rapid pace,¹ further increasing the overall need to address this disparity. In response, a growing movement of researchers interested in tackling this problem has emerged, leading to the creation of organizations like the Software Carpentry Foundation [4], guidelines for reproducible research [5], and suggestions of "best practices" for scientific computing [1, 6, 7]. However, although there is more awareness of these potential solutions than in past years, these ideas are still not common knowledge. In this seminar, I will review the general background behind these ideas and what computational researchers can learn from other fields such as the software industry. Drawing on my own experience with implementing these ideas, I will provide examples of how you can integrate reproducible research practices into your work using open source tools. Using the "best practices" suggestions as a guide, I will also show ways in which you can better organize your projects and make your code more readable, and then explain how this can help streamline scientific collaboration. Finally, I will close by reflecting on the role that automation can play in achieving these principles and goals (a small illustrative sketch of one such practice follows the reference list below).
References
[1] G. Wilson, J. Bryan, K. Cranston, J. Kitzes, L. Nederbragt, and T. K. Teal, PLoS Comput. Biol. 13, e1005510 (2017).
[2] J. E. Hannay, C. MacLeod, J. Singer, H. P. Langtangen, D. Pfahl, and G. Wilson, in Proc. 2009 31st Int. Conf. Softw. Eng. ICSE Workshops (2009) pp. 1–8.
[3] Z. Merali, Nature 467, 775 (2010).
[4] Software Carpentry, https://software-carpentry.org/.
[5] R. D. Peng, Science 334, 1226 (2011).
[6] G. Wilson, D. A. Aruliah, C. T. Brown, N. P. C. Hong, M. Davis, R. T. Guy, S. H. D. Haddock, K. D. Huff, I. M. Mitchell, M. D. Plumbley, B. Waugh, E. P. White, and P. Wilson, PLoS Biol. 12, e1001745 (2014).
[7] V. Stodden and S. Miguez, J. Open Res. Softw. 2, e21 (2014).
¹ An arXiv query for all pre-prints with metadata containing the term "data science" reveals exponential growth, with the number of submissions approximately doubling every year since 2007.
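As one small, hedged example of the practices discussed in the abstract (hypothetical file names and parameters, with Python chosen only for illustration), recording the parameters and random seed alongside the results makes an analysis far easier to rerun and audit:

import json
from pathlib import Path

import numpy as np

# Hypothetical analysis parameters, kept in one place so they can be saved with the results.
PARAMS = {"seed": 42, "n_samples": 1000}

def run_analysis(params):
    rng = np.random.default_rng(params["seed"])    # fixed seed makes the run repeatable
    sample = rng.normal(size=params["n_samples"])  # stand-in for the real computation
    return {"mean": float(sample.mean()), "std": float(sample.std())}

def main():
    results = run_analysis(PARAMS)
    out_dir = Path("results")
    out_dir.mkdir(exist_ok=True)
    # Write the results together with the exact parameters that produced them.
    record = {"params": PARAMS, "results": results}
    (out_dir / "summary.json").write_text(json.dumps(record, indent=2))

if __name__ == "__main__":
    main()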
Computational Social Science Research Colloquium /
Colloquium in Computational and Data Sciences
Robert Axtell, Professor
Computational Social Science Program,
Department of Computational and Data Sciences
College of Science
and
Department of Economics
College of Humanities and Social Sciences
George Mason University
Are Cities Agglomerations of People or of Firms? Data and a Model
Friday, September 28, 3:00 p.m.
Center for Social Complexity, 3rd Floor Research Hall
All are welcome to attend.
Abstract: Business firms are not uniformly distributed over space. In every country there are large swaths of land on which there are very few or no firms, coexisting with relatively small areas on which large numbers of businesses are located—these are the cities. Since the dawn of civilization, the earliest cities have husbanded a variety of business activities. Indeed, often the raison d'être for the growth of villages into towns and then into cities was the presence of weekly markets and fairs facilitating the exchange of goods. City theorists of today tend to see cities as amalgams of people, housing, jobs, transportation, specialized skills, congestion, patents, pollution, and so on, with the role of firms demoted to merely providing jobs and wages. Reciprocally, very little of the conventional theory of the firm is grounded in the fact that most firms are located in space generally, and in cities specifically. Consider the well-known facts that both firm and city sizes are approximately Zipf distributed. Is it merely a coincidence that the same extreme size distribution approximately describes both firms and cities? Or is it the case that skew firm sizes create skew city sizes? Perhaps it is the other way round, and skew cities permit skew firms to arise? Or is it something more intertwined and complex, a coevolution of firm and city sizes, some kind of dialectical interplay of people working in companies doing business in cities? If firm sizes were not heavy-tailed but followed, say, an exponential distribution instead, could giant cities still exist? Or if cities were not so varied in size, as apparently they were not in feudal times, would firm sizes be significantly attenuated? In this talk I develop the empirical foundations of this puzzle, one that has been little emphasized in the extant literatures on firms and cities, probably because these are, for the most part, distinct literatures. I then go on to describe a model of individual people (agents) who arrange themselves into both firms and cities in approximate agreement with U.S. data.
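To make the Zipf regularity concrete (a minimal sketch with synthetic sizes, not the U.S. data or the model from the talk), a rank-size regression on heavy-tailed sizes yields a slope near -1, the signature of Zipf's law:

import numpy as np

rng = np.random.default_rng(1)
# Synthetic "firm sizes": 1/U is Pareto-distributed with tail index 1, i.e. roughly Zipfian.
sizes = np.sort(1.0 / rng.random(100_000))[::-1]

ranks = np.arange(1, sizes.size + 1)
# Regress log(size) on log(rank); Zipf's law corresponds to a slope near -1.
slope, intercept = np.polyfit(np.log(ranks), np.log(sizes), 1)
print(f"estimated rank-size slope: {slope:.2f}")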
Computational Social Science Research Colloquium /
Colloquium in Computational and Data Sciences
Maciej Latek, Chief Technology Officer, trovero.io
Ph.D. in Computational Social Science, 2011
George Mason University
Industrializing multi-agent simulations:
The case of social media marketing, advertising and influence campaigns
Friday, October 12, 3:00 p.m.
Center for Social Complexity, 3rd Floor Research Hall
All are welcome to attend.
Abstract: System engineering approaches required to transition multi-agent simulations out of science into decision support share features with AI, machine learning and application development, but also present unique challenges. In this talk, I will use trovero as an example to illustrate how some of these challenges can be addressed.
As a platform that helps advertisers and marketers plan and implement campaigns on social media, trovero comprises social network simulations, used for optimization and automation, and network population synthesis, used to preserve people's privacy while maintaining a robust picture of social media communities. Social network simulations forecast campaign outcomes and pick the right campaigns for given KPIs; simulation is the only viable way to do this reliably, because big-data methods are fundamentally unfit for social network data. Network population synthesis enables working with aggregate data without relying on data-sharing agreements with social media platforms, which have grown ever more reluctant to share user data with third parties after GDPR and the Cambridge Analytica debacle.
I will outline how these two approaches complement one another, what computational and data infrastructure is required to support them and how workflows and interactions with social media platforms are organized.
Computational Social Science Research Colloquium /
Colloquium in Computational and Data Sciences
J. Brent Williams
Founder and CEO
Euclidian Trust
Improved Entity Resolution as a Foundation for Model Precision
Friday, November 2, 3:00 p.m.
Center for Social Complexity, 3rd Floor Research Hall
All are welcome to attend.
Abstract: Analyzing behavior, identifying and classifying micro-differentiations, and predicting outcomes all rely on the establishment of a core foundation of reliable and complete data linking. Whether the data describe individuals, families, companies, or markets, acquiring data from orthogonal sources results in significant matching challenges. These challenges are difficult because attempts to eliminate (or minimize) false positives yield an increase in false negatives, and the converse is also true.
This discussion will focus on the business challenges in matching data and their primary and compounded impact on subsequent outcome analysis. Drawing on practical experience, the speaker led the development and first commercialization of a novel approach to "referential matching." This approach leads to a more comprehensive unit data model (patient, customer, company, etc.), which enables greater computational resolution and model accuracy through hyper-accurate linking, disambiguation, and detection of obfuscation. The discussion also covers the impact of enumeration strategies, data obfuscation/hashing, and natural changes in unit data models over time.
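As a rough, hedged sketch of the precision trade-off described above (toy records and a generic string-similarity score, not the speaker's referential-matching approach), raising a match threshold removes false positives but creates false negatives, and vice versa:

from difflib import SequenceMatcher

# Toy record pairs with ground-truth labels: True means the two strings refer to the same entity.
pairs = [
    ("Jonathan Smith, 123 Oak St", "Jon Smith, 123 Oak Street", True),
    ("ACME Corp.", "ACME Corporation", True),
    ("Mary Jones, Fairfax VA", "Mary Johns, Fairfax VA", False),
    ("Euclid Data LLC", "Euclidean Data Inc", False),
]

def similarity(a, b):
    # Generic character-level similarity in [0, 1]; a stand-in for a real matching score.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for threshold in (0.60, 0.75, 0.90):
    false_pos = sum(similarity(a, b) >= threshold and not same for a, b, same in pairs)
    false_neg = sum(similarity(a, b) < threshold and same for a, b, same in pairs)
    print(f"threshold {threshold:.2f}: false positives = {false_pos}, false negatives = {false_neg}")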