Publications

2020-10-27

Electrochemical SARS-CoV-2 Sensing at Point-of-Care and Artificial Intelligence for Intelligent COVID-19 Management

To manage the COVID-19 pandemic, development of rapid, selective, sensitive diagnostic systems for early stage β-coronavirus severe acute respiratory syndrome (SARS-CoV-2) virus protein detection is emerging as a necessary response to generate the bioinformatics needed for efficient smart diagnostics, optimization of therapy, and investigation of therapies of higher efficacy. Experts recommend such diagnostic systems as an urgent need in order to achieve the mass and targeted SARS-CoV-2 detection required to manage the COVID-19 pandemic through the understanding of infection progression and timely therapy decisions. To achieve these tasks, there is scope for developing smart sensors to rapidly and selectively detect SARS-CoV-2 protein at the picomolar level. COVID-19 infection, due to human-to-human transmission, demands diagnostics at the point-of-care (POC) without the need for experienced labor and sophisticated laboratories. Keeping the above-mentioned considerations in mind, we propose to explore the compartmentalization approach by designing and developing nanoenabled miniaturized electrochemical biosensors to detect SARS-CoV-2 virus at the site of the epidemic as the best way to manage the pandemic. Such a COVID-19 diagnostics approach based on a POC sensing technology can be interfaced with the Internet of things and artificial intelligence (AI) techniques (such as machine learning and deep learning for diagnostics) for investigating useful informatics via data storage, sharing, and analytics. Keeping COVID-19 management related challenges and aspects under consideration, our work in this review presents a collective approach involving electrochemical SARS-CoV-2 biosensing supported by AI to generate the bioinformatics needed for early stage COVID-19 diagnosis, correlation of viral load with pathogenesis, understanding of pandemic progression, therapy optimization, POC diagnostics, and disease management in a personalized manner.

ACS Applied Bio Materials, Vol 3, 11 (2020).

2020-10-19

Gopal Pandurangan

Scientific Computing

PandaSQL: Parallel Randomized Triangle Enumeration with SQL Queries

Triangles are an important pattern in large-scale graph analysis for their practical use in many real-life applications. However, with the expansion of networks, maintaining a balanced computational load is challenging, especially for problems like triangle computations, because of skewed vertices. On the other hand, there is a huge amount of data in database management systems (DBMSs) that can be modeled and analyzed as graphs. With these motivations in mind, we developed PandaSQL, a novel approach using SQL queries to enumerate all the triangles in a given graph based on the Randomized Triangle Enumeration Algorithm. Our approach is elegant, abstract, and short compared to traditional languages like C++ or Python. Moreover, our partitioning queries ensure perfect load balancing. Thus, the triangle enumeration is independent, local, and parallel.
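As an illustration only, the idea of enumerating triangles with SQL over randomly colored vertices can be sketched with Python's built-in sqlite3 module. This is not the PandaSQL implementation: the table names (edge, color), the function name enumerate_triangles, and the num_colors parameter are assumptions made for this example, and the color triples are processed one after another rather than on separate machines, so no load-balancing guarantee is demonstrated.

```python
import random
import sqlite3
from itertools import product


def enumerate_triangles(edges, num_colors=2, seed=0):
    """Enumerate each triangle once as (a, b, c) with a < b < c.

    Hypothetical sketch: vertices get a random color, and each ordered
    color triple is handled by its own SQL query, which stands in for
    an independent partition.
    """
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (u INTEGER, v INTEGER, PRIMARY KEY (u, v))")
    conn.execute("CREATE TABLE color (node INTEGER PRIMARY KEY, c INTEGER)")

    # Store every undirected edge once, oriented so that u < v.
    norm = {(min(a, b), max(a, b)) for a, b in edges if a != b}
    conn.executemany("INSERT INTO edge VALUES (?, ?)", sorted(norm))

    # Random vertex coloring defines the partitions.
    nodes = sorted({n for e in norm for n in e})
    conn.executemany(
        "INSERT INTO color VALUES (?, ?)",
        [(n, rng.randrange(num_colors)) for n in nodes],
    )

    # Self-join: edges (a, b), (b, c), (a, c) with a < b < c form a triangle.
    query = """
        SELECT e1.u, e1.v, e2.v
        FROM edge e1
        JOIN edge e2 ON e2.u = e1.v
        JOIN edge e3 ON e3.u = e1.u AND e3.v = e2.v
        JOIN color ca ON ca.node = e1.u
        JOIN color cb ON cb.node = e1.v
        JOIN color cc ON cc.node = e2.v
        WHERE ca.c = ? AND cb.c = ? AND cc.c = ?
    """
    triangles = []
    # Each ordered color triple is an independent unit of work.
    for triple in product(range(num_colors), repeat=3):
        triangles.extend(conn.execute(query, triple).fetchall())
    conn.close()
    return sorted(triangles)
```

Because each triangle's vertices, taken in increasing order, match exactly one ordered color triple, every triangle is reported by exactly one partition and the partitions need no coordination.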

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management

2020-10-14

Natural Language Processing

Using Computational Text Analysis Tools to Study African Online News Content

After radio and television, online media are fast becoming a primary source of information for many Africans. With this increase, it is becoming necessary for media researchers to explore ways to better understand production, content and reception patterns of online news on the continent. This paper introduces freely available tools for systematic and (semi-)automated collection, storage and analysis of digital news that build on recent advances in the computational power of personal computers, and the decreasing costs of storing large amounts of data. I start by describing existing challenges in the collection of online news text data, including the limited amount of African news content in commercial databases, and the methodological shortcomings of using commercial search engines. Then, I present a four-stage approach using packages written in the open-source R programming language to automate the collection of online news content (web scraping); transform this content for easier storage and analysis (data processing); use computational text analysis tools to describe and categorise data; and present the results in ways that are easier to understand (data visualisation). The paper concludes with a summary of recommendations for using computational methods to study African communication phenomena.

African Journalism Studies, 41(4), 68–82.

2020-07-29

Richard Meisel

Scientific Computing

The maintenance of polygenic sex determination depends on the dominance of fitness effects which are predictive of the role of sexual antagonism

In species with polygenic sex determination, multiple male- and/or female-determining loci on different proto-sex chromosomes segregate as polymorphisms within populations. The extent to which these polymorphisms are stable equilibria is not yet resolved. Previous work demonstrated that polygenic sex determination is most likely to be maintained as a stable polymorphism when the proto-sex chromosomes have opposite (sexually antagonistic) fitness effects in males and females. However, these models usually consider polygenic sex determination systems with only two proto-sex chromosomes, or they do not broadly consider the dominance of the variants under selection. To address these shortcomings, I used forward population genetic simulations to identify selection pressures that can maintain polygenic sex determination under different dominance scenarios in a system with more than two proto-sex chromosomes (modeled after the house fly). I found that overdominant fitness effects of male-determining proto-Y chromosomes in males are more likely to maintain polygenic sex determination than dominant, recessive, or additive fitness effects. I also found that additive fitness effects that maintain polygenic sex determination have the strongest signatures of sexually antagonistic selection, but there is also some evidence for sexual antagonism when fitness effects of proto-Y chromosomes are dominant or recessive. More generally, these results suggest that the expected effect of sexually antagonistic selection on the maintenance of genetic variation in natural populations will depend on whether the alleles are sex-linked and the dominance of their fitness effects.

bioRxiv 2020.07.08.193516.

2020-07-23

Jiajia Sun

Image Analysis, Visualization

Unveiling the 3D undercover structure of a Precambrian intrusive complex by integrating airborne magnetic and gravity gradient data into 3D quasi-geology model building

Mineral exploration under a thick sedimentary cover naturally relies on geophysical methods. We have used high-resolution airborne magnetic and gravity gradient data over northeast Iowa to characterize the geology of the concealed Precambrian rocks and evaluate the prospectivity of mineral deposits. Previous researchers have interpreted the magnetic and gravity gradient data in the form of a 2D geologic map of the Precambrian basement rocks, which provides important geophysical constraints on the geologic history and mineral potentials over the Decorah area located in the northeast of Iowa. However, their interpretations are based on 2D data maps and are limited to the two horizontal dimensions. To fully tap into the rich information contained in the high-resolution airborne geophysical data, and to further our understanding of the undercover geology, we have performed separate and joint inversions of magnetic and gravity gradient data to obtain 3D density contrast models and 3D susceptibility models, based on which we carried out geology differentiation. Based on separately inverted physical property values, we have identified 10 geologic units and their spatial distributions in 3D which are all summarized in a 3D quasi-geology model. The extension of 2D geologic interpretation to 3D allows for the discovery of four previously unidentified geologic units, a more detailed classification of the Yavapai country rock, and the identification of the highly anomalous core of the mafic intrusions. Joint inversion allows for the classification of a few geologic units further into several subclasses. We have demonstrated the added value of the construction of a 3D quasi-geology model based on 3D separate and joint inversions.

Interpretation, Volume 8, Issue 4 (2020).

2020-07-16

Lars Grabow

ML / AI

Accelerated Modeling of Lithium Diffusion in Solid State Electrolytes using Artificial Neural Networks

Previous efforts to understand structure-function relationships in high ionic conductivity materials for solid state batteries have predominantly relied on density functional theory (DFT-) based ab initio molecular dynamics (MD). Such simulations, however, are computationally demanding and cannot be reasonably applied to large systems containing more than a hundred atoms. Here, an artificial neural network (ANN) is trained to accelerate the calculation of high accuracy atomic forces and energies used during such MD simulations. After carefully training a robust ANN for four and five element systems, nearly identical lithium ion diffusivities are obtained for Li10GeP2S12 (LGPS) when benchmarking the ANN-MD results with DFT-MD. Applying the ANN-MD approach, the effect of chlorine doping on the lithium diffusivity is calculated in an LGPS-like structure and it is found that a dopant concentration of 1.3% maximizes ionic conductivity. The optimal concentration balances the competing consequences of effective atomic radii and dielectric constants on lithium diffusion and agrees with the experimental composition. Performing simulations at the resolution necessary to model experimentally relevant and optimal concentrations would be infeasible with traditional DFT-MD. Systems that require a large number of simulated atoms can be studied more efficiently while maintaining high accuracy with the proposed ANN-MD framework.

Advanced Theory and Simulations 3(9).

2020-07-10

Lars Grabow

Scientific Computing

Enhancing Technological Applications through Density Functional Theory Modeling of Nanomaterials

Host–guest interactions are crucial in a diverse list of applications, among which gas adsorption and sensing have garnered much interest. To this end, the gas adsorption potential of a metal core–shell-based structure ([Cu12FeK3O]6), Li-doped carbon nanotubes, and the sensing potential of a Cu-based metal–organic framework, lanthanide-doped oxides, have been examined. Finally, nanoparticles can also benefit human health and the environment with diverse examples, such as stimulating growth through the transport of zeatin inside plant cells using Au nanoparticles, sensing and carrying neurotransmitters using boron nitride nanoribbons, and providing antibacterial activity through controlled Ag+ release from Ag embedded in SiO2.

ACS Applied Nano Materials, 3, 7 (2020).

2020-07-02

Jakoah Brgoch

ML / AI

Machine learning 5d-level centroid shift of Ce3+ inorganic phosphors

Information on the 5d level centroid shift (ɛc) of rare-earth ions is critical for determining the chemical shift and the Coulomb repulsion parameter as well as predicting the luminescence and thermal response of rare-earth substituted inorganic phosphors. The magnitude of ɛc depends on the binding strength between the rare-earth ion and its coordinating ligands, which is difficult to quantify a priori and makes phosphor design particularly challenging. In this work, a tree-based ensemble learning algorithm employing extreme gradient boosting is trained to predict ɛc by analyzing the optical properties of 160 Ce3+ substituted inorganic phosphors. The experimentally measured ɛc of these compounds was featurized using the materials' relative permittivity (ɛr), average electronegativity, average polarizability, and local geometry. Because the number of reported ɛr values is limited, it was necessary to utilize a predicted relative permittivity (ɛr,SVR) obtained from a support vector regressor trained on data from ∼2800 density functional theory calculations. The remaining features were compiled from open-source databases and by analyzing the rare-earth coordination environment from each Crystallographic Information File. The resulting ensemble model could reliably estimate ɛc and provide insight into the optical properties of Ce3+-activated inorganic phosphors.

Journal of Applied Physics 128, 013104 (2020).

2020-07-01

Badri Roysam

ML / AI

Artificial intelligence and machine learning in nephropathology

Artificial intelligence (AI) for the purpose of this review is an umbrella term for technologies emulating a nephropathologist’s ability to extract information on diagnosis, prognosis, and therapy responsiveness from native or transplant kidney biopsies. Although AI can be used to analyze a wide variety of biopsy-related data, this review focuses on whole slide images traditionally used in nephropathology. AI applications in nephropathology have recently become available through several advancing technologies, including (i) widespread introduction of glass slide scanners, (ii) data servers in pathology departments worldwide, and (iii) greatly improved computer hardware to enable AI training. In this review, we explain how AI can enhance the reproducibility of nephropathology results for certain parameters in the context of precision medicine using advanced architectures, such as convolutional neural networks, that are currently the state of the art in machine learning software for this task. Because AI applications in nephropathology are still in their infancy, we show the power and potential of AI applications mostly in the example of oncopathology. Moreover, we discuss the technological obstacles as well as the current stakeholder and regulatory concerns about developing AI applications in nephropathology from the perspective of nephropathologists and the wider nephrology community. We expect the gradual introduction of these technologies into routine diagnostics and research for selective tasks, suggesting that this technology will enhance the performance of nephropathologists rather than making them redundant.

Kidney International (Volume 98, Issue 1, July 2020).

2020-06-06

Gopal Pandurangan

Scientific Computing

Efficient Distributed Algorithms for the K-Nearest Neighbors Problem

K-nearest neighbors is a basic problem in machine learning with numerous applications. In this problem, given a (training) set of n data points with labels and a query point q, we want to assign a label to q based on the labels of the K nearest points to the query. We study this problem in the k-machine model, a model for distributed large-scale data. In this model, we assume that the n points are distributed (in a balanced fashion) among the k machines, and the goal is to compute an answer, given a query point to a machine, using a small number of communication rounds.

Our main result is a randomized algorithm in the k-machine model that runs in O(log K) communication rounds with high success probability (regardless of the number of machines k and the number of points n). The message complexity of the algorithm is small, taking only O(k log K) messages. Our bounds are essentially the best possible for comparison-based algorithms. We also implemented our algorithm and show that it performs well in practice.
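To make the setting concrete, the following is a minimal single-process sketch of K-nearest-neighbor labeling over data split across machines. It is not the paper's algorithm and does not claim its O(log K) round bound: it uses a generic randomized pivot selection to find the K-th smallest distance, with each loop iteration standing in for one communication round. The function name distributed_knn_label and the list-of-lists representation of machines are assumptions made for this example.

```python
import bisect
import random
from collections import Counter


def distributed_knn_label(machines, query, K, seed=0):
    """Label `query` by majority vote among its K nearest points.

    machines: list of lists of (point, label); point is a tuple of numbers.
    Hypothetical sketch: a coordinator repeatedly broadcasts a random
    pivot and machines answer only with counts.
    """
    rng = random.Random(seed)

    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))

    # Local pruning: a machine's points beyond its own K nearest cannot
    # be among the global K nearest. Keys (dist, machine, index) are unique.
    local = [
        sorted((dist2(p), i, j) for j, (p, _) in enumerate(m))[:K]
        for i, m in enumerate(machines)
    ]
    need = min(K, sum(len(a) for a in local))
    if need <= 0:
        raise ValueError("need K >= 1 and at least one data point")

    active = [list(a) for a in local]
    while True:
        # Coordinator samples a pivot uniformly from all active keys.
        sizes = [len(a) for a in active]
        i = rng.choices(range(len(active)), weights=sizes)[0]
        pivot = rng.choice(active[i])
        # Each machine reports how many of its active keys are <= pivot.
        counts = [bisect.bisect_right(a, pivot) for a in active]
        c = sum(counts)
        if c == need:
            threshold = pivot
            break
        if c > need:
            # Target is strictly below the pivot: drop keys >= pivot.
            active = [a[:bisect.bisect_left(a, pivot)] for a in active]
        else:
            # Target is above the pivot: drop keys <= pivot.
            active = [a[n:] for a, n in zip(active, counts)]
            need -= c

    # Every key up to the threshold belongs to the K nearest neighbors.
    votes = Counter(
        machines[i][j][1] for a in local for (d, i, j) in a if (d, i, j) <= threshold
    )
    return votes.most_common(1)[0][0]
```

In this sketch each round exchanges one pivot and one count per machine, so the message cost per round is proportional to k; the number of rounds depends on the random pivots.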

SPAA '20: Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures.