Calendar
During the fall 2018 semester, the Computational Social Science (CSS) and Computational Sciences and Informatics (CSI) programs have merged their seminar/colloquium series, in which students, faculty, and guest speakers present their latest research. These seminars are free and open to the public. The series takes place on Fridays from 3:00 to 4:30 p.m. in the Center for Social Complexity Suite, which is located on the third floor of Research Hall.
If you would like to join the seminar mailing list, please email Karen Underwood.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Kirk Borne, PhD
Principal Data Scientist and Executive Advisor
Booz Allen Hamilton
Some Interesting Applications of Machine Learning Algorithms
Tuesday, October 10, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: I will present a variety of atypical use cases and applications (in science and in business) of some typical textbook machine learning algorithms, including regression, clustering analysis, association mining, time series analysis, and network analysis.
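For readers who want a concrete starting point, below is a minimal sketch of two of the textbook algorithm families named in the abstract (regression and clustering), written in Python with scikit-learn on synthetic data. It illustrates the algorithms only and is not drawn from Dr. Borne's talk.

```python
# A minimal sketch (not from the talk) of two textbook algorithms named in
# the abstract, using scikit-learn with synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Regression: recover a linear trend from noisy observations.
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 1, size=200)
model = LinearRegression().fit(X, y)
print("estimated slope:", model.coef_[0])

# Clustering analysis: group points drawn from two well-separated blobs.
pts = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pts)
print("cluster sizes:", np.bincount(labels))
```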
Dr. Borne advises and consults with numerous organizations, agencies, and partners on the use of data and analytics for discovery, decision support, and innovation. Previously, he was a Professor at George Mason University (GMU) for 12 years in the CSI and CDS programs, where he did research, taught, and advised students in data science. Prior to that, Dr. Borne spent nearly 20 years supporting data systems activities on NASA space science research programs, including a role as NASA's Data Archive Project Scientist for the Hubble Space Telescope.
Recently, Dr. Borne was ranked #2 worldwide among Big Data experts to follow (http://ipfconline.fr/blog/2017/05/22/fine-list-of-50-top-world-big-data-experts-to-follow-in-2017-with-moz-social-score/).
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Stephen Lockett, PhD
Principal Scientist and Director of the Optical Microscopy and Analysis Laboratory
Frederick National Laboratory for Cancer Research, National Cancer Institute
Transitioning Microscopy from 2D to 3D for Tumor Biology and Pathology
Monday, October 16, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Discerning the 3D context of individual cells in tissue is fundamental for understanding tissue development, homeostasis in healthy adult organs, and tumor emergence. Determining this context, which requires 3D microscopic imaging to depths of hundreds of microns into tissue, is hampered in multiple ways: (1) penetration of fluorescent dyes deep into tissue, (2) absorption and scattering of light, (3) sufficiently high-speed image acquisition, and (4) automated image analysis, because datasets are too large for human visual interpretation. Furthermore, in the case of tumors, standard practice in clinical pathology is to image only a thin slice about 5 microns thick, which has several limitations: there is significant inter-observer variability in the diagnosis, the slice can miss the tumor, and determination of the spatial relationships of cells to each other is incomplete.

To address these limitations, we have evaluated tissue clearing protocols, which reduce scattering and spherical aberration, and we have achieved high-spatial-resolution imaging to depths of up to 0.35 mm using spinning disk confocal microscopy or lightsheet microscopy for high-speed image acquisition.

The image analysis tasks for 3D images are much more demanding than for 2D images for the following reasons: (1) visualization of the entirety of each object (cell or cell nucleus) is not possible in a single display, (2) purely interactive segmentation is impractical even for one object, (3) automatic algorithms are imperfect, and (4) in basic research relatively few cells are imaged, increasing the need to analyze each cell as accurately as possible. Consequently, we are working on tools that merge automatic segmentation algorithms, computer vision, and human annotation for delineating objects in 3D images. We have developed a graphcut-based algorithm for finding the globally optimal surface delineating each manually seeded nucleus. The algorithm is restricted to point-convex objects, which is generally satisfactory for nuclei but not for entire cells, which can have arbitrary shapes. For whole-cell segmentation in 3D, our approach is slice by slice. In the first 2D image slice (approximately through the center of the cell), the user draws an approximate 2D border, which is automatically optimized using active contour modeling. This optimized border then serves as the approximate border for the adjacent slices, which are in turn automatically optimized. The method is accurate except where the cell surface lies in the plane of the slice, so a next step is to facilitate segmenting each cell in different orientations.

We are starting to utilize this technology to understand the complex interplay between cancer cells and normal cells, particularly immune system cells and the supporting mesenchyme, that leads to tumor growth or, in the case of treatment, regression. The research was funded by NCI Contract No. HHSN261200800001E and supported in part by the Intramural Research Program of the National Institutes of Health, National Cancer Institute, Center for Cancer Research.
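To make the slice-by-slice idea concrete, here is a minimal Python sketch of contour propagation using the active contour implementation in scikit-image: the contour optimized on one slice becomes the initial guess for the next. The synthetic volume, initial circle, and parameter values are all illustrative assumptions; this is not the laboratory's code.

```python
# A minimal sketch of slice-by-slice active contour propagation, in the
# spirit of the method described above. Uses scikit-image on a synthetic
# volume; all parameters are illustrative.
import numpy as np
from skimage.filters import gaussian
from skimage.segmentation import active_contour

def segment_cell_slicewise(volume, center_z, init_snake, n_slices=5):
    """Propagate a 2D active contour through adjacent z-slices."""
    contours = {}
    snake = init_snake
    for z in range(center_z, min(center_z + n_slices, volume.shape[0])):
        smoothed = gaussian(volume[z], sigma=2, preserve_range=False)
        snake = active_contour(smoothed, snake, alpha=0.015, beta=10, gamma=0.001)
        contours[z] = snake  # optimized border becomes the next slice's guess
    return contours

# Synthetic volume: a bright sphere standing in for a cell.
zz, yy, xx = np.mgrid[0:40, 0:100, 0:100]
volume = ((zz - 20)**2 + (yy - 50)**2 + (xx - 50)**2 < 25**2).astype(float)

# Initial border: a circle drawn "by the user" on the central slice, as
# (row, col) coordinates.
t = np.linspace(0, 2 * np.pi, 100)
init = np.column_stack([50 + 30 * np.sin(t), 50 + 30 * np.cos(t)])
borders = segment_cell_slicewise(volume, center_z=20, init_snake=init)
```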
Stephen Lockett received the Ph.D. degree from the Department of Medicine, Birmingham University, England. He has published over 120 research papers and has received several international awards. His research interests include fluorescence microscopy and the development of analysis software for extracting quantitative information from images.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Bartley Richardson, PhD
Sotera Defense Solutions
Representation of Cyber Knowledge as Discrete Sequences
Monday, October 23, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: As devices continue to gain network connectivity and require less interactivity from human operators, the nature of network transmissions is shifting, and data is being created faster and with more fidelity than ever before. One way to view this data is in the context of a cyber language, analogous to, but semantically and syntactically different from, a natural language. After sequences are constructed over a large (petabyte-scale) dataset, unsupervised machine learning and deep learning techniques are used to model communication, identifying typical behavior and flagging unlikely events. This seminar presents the foundations of this new approach to cyber anomaly detection as well as the enabling analytic techniques.
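As a rough illustration of the "cyber language" framing, the sketch below tokenizes hypothetical flow records into discrete "words", groups them into per-host sequences, and learns embeddings with word2vec (gensim). The record format and tokenization scheme are my assumptions, not the speaker's implementation.

```python
# A hedged sketch of the "cyber language" idea: encode network events as
# discrete tokens, group them into per-host sequences, and learn embeddings
# so that unusual event sequences stand out. Illustrative only.
from gensim.models import Word2Vec  # assumes gensim >= 4.0

# Hypothetical flow records: (src_host, dst_port, protocol).
flows = [
    ("10.0.0.5", 443, "tcp"), ("10.0.0.5", 53, "udp"),
    ("10.0.0.7", 22, "tcp"), ("10.0.0.5", 443, "tcp"),
]

# Token = "port/protocol"; sentence = the ordered event sequence of one host.
sequences = {}
for host, port, proto in flows:
    sequences.setdefault(host, []).append(f"{port}/{proto}")

model = Word2Vec(list(sequences.values()), vector_size=16, window=2, min_count=1)
print(model.wv.most_similar("443/tcp"))  # nearby tokens in embedding space
```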
Dr. Richardson has nearly a decade of experience in data science, cloud computing, software development, and machine learning. He has served as both a department chair and a visiting professor at two universities and has published over 10 articles. He is currently a principal data scientist, technical lead, and principal investigator on multiple government-sponsored projects, including one DARPA research program.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Robert Axtell, PhD
Computational Social Science Program, Department of Computational and Data Sciences,
College of Science
Department of Economics, College of Humanities and Social Sciences
Krasnow Institute for Advanced Study
George Mason University
Co-Director
Computational Public Policy Lab
Krasnow Institute for Advanced Study and Schar School of Policy and Government
External Professor, Santa Fe Institute (santafe.edu)
External Faculty, Northwestern Institute on Complex Systems (nico.northwestern.edu)
Scientific Advisory Council, Waterloo Institute for Complexity and Innovation (wici.ca)
Ages and Lifetimes of U.S. Firms: Why Businesses Should NOT be Treated Like People
Monday, October 30, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Over the last 150 years, American corporations have acquired many rights associated with individual citizens, such as free speech, the ability to make campaign contributions, and so on. In this talk I will quantify the age-related demographic properties of U.S. business firms and argue that the peculiar nature of firm aging suggests that businesses are very much unlike individual people. Specifically, using data on all 6 million U.S. firms having employees, I document that firm ages follow a discrete Weibull distribution while firm lifetimes follow a closely related distribution. Further, the hazard rates associated with firm survival are monotone declining according to a power law. From this the expected remaining lifetime can be computed, and it will be demonstrated that this INCREASES as firms age: while a new firm in the U.S. can expect to live for about 15 years, a firm that has survived 50 years can expect to live for 30 more. Finally, conditioning on firm size leads to even more extreme results: increasing firm size by a factor of ten cuts the hazard rate in half. In sum, these results suggest that firm aging is very different from biological aging, making analogies between firms and people both quantitatively inaccurate and qualitatively wrong-headed. Technically, this talk will focus on the application of conventional demographic techniques to economic and financial data, including failure/survival analysis with censored data.
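An expected remaining lifetime that grows with age is a standard property of Weibull distributions with shape parameter below one. The short numerical sketch below, using illustrative parameters rather than the paper's estimates, computes the mean residual life E[T − a | T > a] at several ages to show the effect.

```python
# A minimal numerical sketch (illustrative parameters, not the paper's
# estimates) of the talk's key point: for a Weibull with shape < 1 the
# hazard declines and the expected remaining lifetime grows with age.
import numpy as np
from scipy.stats import weibull_min
from scipy.integrate import quad

shape, scale = 0.6, 10.0          # shape < 1 => monotone-declining hazard
dist = weibull_min(shape, scale=scale)

def mean_residual_life(a):
    """E[T - a | T > a] = (1 / S(a)) * integral_a^inf S(t) dt."""
    tail, _ = quad(dist.sf, a, np.inf)
    return tail / dist.sf(a)

for age in (0, 10, 50):
    print(f"age {age:>2}: expect {mean_residual_life(age):6.1f} more years")
# The printed values increase with age, mirroring (with different, made-up
# numbers) the 15-years-at-birth vs. 30-more-years-at-50 pattern above.
```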
Rob Axtell earned an interdisciplinary Ph.D. at Carnegie Mellon University, where he studied computing, social science, and public policy. His teaching and research involve computational and mathematical modeling of social and economic processes. Specifically, he works at the intersection of multi-agent systems (computer science) and the social sciences, building so-called agent-based models of a variety of market and non-market phenomena.
His research has been published in the leading scientific journals, including Science and the Proceedings of the National Academy of Sciences, USA, and reprised in Nature, and has appeared in top disciplinary journals (e.g., American Economic Review, Computational and Mathematical Organization Theory, Economic Journal), in general interest journals (e.g., PLOS One), and in specialty journals (e.g., Journal of Regulatory Economics, Technological Forecasting and Social Change). His research has been supported by American philanthropies (e.g., John D. and Catherine T. MacArthur Foundation, Institute for New Economic Thinking) and government organizations (e.g., National Science Foundation, Department of Defense, Small Business Administration, Office of Naval Research, Environmental Protection Agency). Stories about his research have appeared in major magazines (e.g., Economist, Atlantic Monthly, Scientific American, New Yorker, Discover, Wired, New Scientist, Technology Review, Forbes, Harvard Business Review, Science News, Chronicle of Higher Education, Byte, Le Temps Strategique) and newspapers (e.g., Wall Street Journal, Washington Post, Los Angeles Times, Boston Globe, Detroit Free Press, Financial Times). He is co-author, with J. M. Epstein, of Growing Artificial Societies: Social Science from the Bottom Up (MIT Press), widely cited as an example of how to apply modern computing to the analysis of social and economic phenomena.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Shane Frasier, Ph.D.
Department of Homeland Security
Data Science and Cybersecurity at the Department of Homeland Security
Monday, TBA, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Among its many responsibilities, the Department of Homeland Security works to improve the security of the computer networks of the federal government and our nation's critical infrastructure. This talk will discuss some of the ways in which that is done, and some of the ways in which data science can contribute to that goal.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Jason M. Kinser, D.Sc.
Chair, Department of Computational and Data Sciences
George Mason University
Image Operators – A World Premiere
Monday, November 6, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: The onslaught of digital detectors has created the ability to capture massive amounts of image data. Analysis techniques have been maturing for decades, but this new flood of image data will tax the foundations of information dissemination. Published descriptions of image processes often consume much more real estate than do the scripts required to execute them. Furthermore, many published descriptions are imprecise. This talk will preview a new mathematical language dedicated solely to the fields of image processing and analysis. The language is paired with implementations in Python and Matlab, so there is a one-to-one correspondence between a mathematical description and its computer execution. The talk will present several examples and culminate in an interactive analysis of image processing protocols.
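Since the talk pairs mathematical notation with Python, the toy sketch below shows the general flavor of composable image operators whose chaining in code mirrors chained notation. It is my own illustration, not Dr. Kinser's language or implementation.

```python
# A toy sketch of composable image operators: chaining them in code reads
# like chaining operators in notation. Illustrative only.
import numpy as np
from scipy import ndimage

def smooth(sigma):
    return lambda img: ndimage.gaussian_filter(img, sigma)

def threshold(t):
    return lambda img: (img > t).astype(img.dtype)

def compose(*ops):
    """compose(f, g)(x) applies g first, then f, like f(g(x))."""
    def apply(img):
        for op in reversed(ops):
            img = op(img)
        return img
    return apply

# "Threshold after smoothing" reads the same in code as in the notation.
pipeline = compose(threshold(0.5), smooth(2.0))
image = np.random.default_rng(1).random((64, 64))
binary = pipeline(image)
```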
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Benjamin J. Radford, PhD
Principal Data Scientist
Sotera Defense Solutions
Clustering Techniques for Unsupervised Machine Learning
Monday, November 13, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Cluster analysis comprises a broad class of unsupervised algorithms applicable to a variety of data science problems. An overview of several clustering models is provided, and example use cases for clustering are discussed. Multivariate Gaussian mixture models are then discussed in detail, and estimation techniques are outlined. K-selection is also discussed in the context of Gaussian mixture models. The talk concludes with a short discussion of how clustering techniques might be used in the context of cybersecurity.
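For orientation, here is a minimal sketch of the workflow the abstract outlines: fitting multivariate Gaussian mixtures for a range of K and selecting K by BIC, using scikit-learn on synthetic data. The data and the K range are illustrative assumptions.

```python
# A minimal sketch of Gaussian mixture modeling with BIC-based K-selection,
# using scikit-learn on synthetic data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Three well-separated 2D blobs, so the "right" K is 3.
X = np.vstack([rng.normal(loc, 0.5, (150, 2)) for loc in (0, 3, 6)])

# Fit mixtures for a range of K and keep the model with the lowest BIC.
models = [GaussianMixture(n_components=k, random_state=0).fit(X)
          for k in range(1, 7)]
best = min(models, key=lambda m: m.bic(X))
print("selected K:", best.n_components)
labels = best.predict(X)   # cluster assignments under the chosen model
```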
Dr. Radford is a Principal Data Scientist at Sotera Defense Solutions where he works on data-driven cybersecurity research programs for the Department of Defense. He received his Ph.D. in political science from Duke University in 2016. His research interests include political methodology, security and political conflict, the political implications of cyberspace, and automated event data coding. Dr. Radford’s dissertation demonstrated the semi-automated population of dictionaries for event-coding in novel domains. He is also an avid guitarist.
There will be no Computational Research and Applications Seminar on Monday, November 20.
Happy Thanksgiving!
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
Juan R. Cebral, Ph.D.
Bioengineering & Mechanical Engineering Departments
Volgenau School of Engineering
George Mason University
Hemodynamics of Cerebral Aneurysms: Helping Diagnosis and Treatment
Monday, November 27, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: We use image-based computational fluid dynamics to model blood flow in human cerebral arteries and aneurysms, with three specific goals: 1) to identify hemodynamic conditions that predispose aneurysms to instability and rupture, and thus help with more precise selection of patients at high risk; 2) to advance the understanding of the disease mechanisms and enable drug-based therapies targeting specific pathways of wall degeneration and weakening; and 3) to evaluate devices and procedures in order to improve treatment planning and long-term outcomes. In this talk I will summarize some of our recent progress and results along these three lines of research, and discuss future directions.
COMPUTATIONAL RESEARCH AND APPLICATIONS SEMINAR
James Glasbrenner, PhD
Assistant Professor
George Mason University
Reproducible Research & Best Practices for Computational Science
Monday, December 4, 4:30-5:45
Exploratory Hall, Room 3301
ABSTRACT: Have you ever had one of the following thoughts while working on your research?
- I can’t remember where I put that data file.
- I knew what these variables meant when I wrote them last year.
- Did I accidentally delete that email with the final version of our research paper attached?
- Why does my collaborator’s program delete the last row and column of this array before entering the main loop?
If so, then you're not alone, because "most researchers are never taught the equivalent of basic lab skills for research computing" [1]. This situation persists even as the average scientific researcher devotes as much as 30% of their time to developing, and 40% of their time to using, scientific software [2]. Underdeveloped skills in programming, project organization, and documentation can lead to general frustration, productivity losses, an increased risk that a researcher won't be able to reproduce his or her own work, and even serious computational errors that invalidate a study's general conclusions [3]. At the same time, the number of scientific research groups integrating data science topics and methods into their programs is increasing at a rapid pace¹, further increasing the overall need to address this disparity.

In response, a growing movement of researchers interested in tackling this problem has emerged, leading to the creation of organizations like the Software Carpentry Foundation [4], guidelines for reproducible research [5], and suggested "best practices" for scientific computing [1, 6, 7]. However, although there is more awareness of these potential solutions than in past years, these ideas are still not common knowledge.

In this seminar, I will review the general background behind these ideas and what computational researchers can learn from other fields such as the software industry. Drawing on my own experience implementing these ideas, I will provide examples of how you can integrate reproducible research practices into your work using open source tools. Using the "best practices" suggestions as a guide, I will also show ways to better organize your projects and make your code more readable, and then explain how this can help streamline scientific collaboration. Finally, I will close by reflecting on the role that automation can play in achieving these principles and goals.
References
[1] G. Wilson, J. Bryan, K. Cranston, J. Kitzes, L. Nederbragt, and T. K. Teal, PLoS Comput. Biol. 13, e1005510 (2017).
[2] J. E. Hannay, C. MacLeod, J. Singer, H. P. Langtangen, D. Pfahl, and G. Wilson, in Proc. 2009 31st Int. Conf. Softw. Eng. ICSE Workshops (2009) pp. 1–8.
[3] Z. Merali, Nature 467, 775 (2010).
[4] "Software Carpentry," https://software-carpentry.org/.
[5] R. D. Peng, Science 334, 1226 (2011).
[6] G. Wilson, D. A. Aruliah, C. T. Brown, N. P. C. Hong, M. Davis, R. T. Guy, S. H. D. Haddock, K. D. Huff, I. M. Mitchell, M. D. Plumbley, B. Waugh, E. P. White, and P. Wilson, PLoS Biol. 12, e1001745 (2014).
[7] V. Stodden and S. Miguez, J. Open Res. Softw. 2, e21 (2014).
¹ An arXiv query for all pre-prints with metadata containing the term "data science" reveals exponential growth, with the number of submissions approximately doubling every year since 2007.
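As one small illustration of the practices the abstract and its references describe, the sketch below (mine, not from the talk) shows a seeded, path-explicit analysis step that records its own provenance next to its output. All file names and the analysis step itself are hypothetical.

```python
# One small illustration of reproducible-research practices: a fixed random
# seed, explicit paths instead of "that data file somewhere", and provenance
# recorded alongside every output. File names are hypothetical.
import json
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path

import numpy as np

OUT = Path("results")
OUT.mkdir(exist_ok=True)

rng = np.random.default_rng(20171204)   # seeded, so the run is repeatable
sample = rng.normal(size=1000)          # stand-in for a real analysis step
np.savetxt(OUT / "sample.txt", sample)

# Record exactly how this result was produced.
try:
    commit = subprocess.run(["git", "rev-parse", "HEAD"],
                            capture_output=True, text=True).stdout.strip()
except OSError:                         # e.g., git not installed
    commit = "unknown"
provenance = {
    "script": sys.argv[0],
    "git_commit": commit,
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
(OUT / "sample.provenance.json").write_text(json.dumps(provenance, indent=2))
```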