Shared Analytics Across Secure and Unsecured Networks by Admin

bm_neural.png

Next week I will be attending an IEEE working group meeting on IEEE P2795. This standard identifies the requirements for using shared analytics over secured and unsecured networks. It establishes a consistent method of using an overarching interoperability framework to utilize one or more disparate data systems for analytic purposes without an analytic user having explicit access to or sharing the data within these systems.

This standard allows for a high assurance method of sharing access to information for analysis without moving data beyond firewall protection. It facilitates sharing virtual access to aggregate data without the need for direct access to personal health information (PHI), personally identifiable information (PII), or other sensitive data. The standard supports a scenario where an entity (institution and/or technology) with an analytic might wish to analyze data stored at another entity (institution and/or technology) within appropriate cyber security constraints.
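The scenario above, where an analytic travels to the data instead of the data leaving the firewall, can be sketched roughly as follows. This is a minimal illustration and not the mechanism defined by IEEE P2795: the `DataHolder` class, the `MIN_RECORDS` threshold, and the `mean_age` analytic are all hypothetical names chosen for this example.

```python
from statistics import mean

MIN_RECORDS = 5  # assumed disclosure threshold, not a value from P2795


class DataHolder:
    """Hypothetical entity that keeps its records behind its own firewall."""

    def __init__(self, records):
        self._records = list(records)  # row-level data is never handed out

    def run_analytic(self, analytic):
        """Run a visiting analytic locally and release only a numeric aggregate."""
        if len(self._records) < MIN_RECORDS:
            raise PermissionError("too few records to release an aggregate")
        result = analytic(self._records)
        if not isinstance(result, (int, float)):
            raise TypeError("only numeric aggregates may leave the data holder")
        return result


def mean_age(records):
    """Example analytic: reduces the records to a single number."""
    return mean(r["age"] for r in records)


# The analytic owner never sees patient rows, only the returned aggregate.
hospital = DataHolder(
    {"patient_id": i, "age": age} for i, age in enumerate([34, 51, 47, 62, 29, 58])
)
print(hospital.run_analytic(mean_age))
```

A real deployment would also need to vet the analytic itself, since a single number can still leak information about individuals; the type check here only illustrates the idea that results, not records, cross the boundary.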

Defining the Problem

While this standard deals directly with personal health information (PHI) and personally identifiable information (PII) in healthcare data, the solutions being explored here are applicable to data science problems in the ODNI agencies: how can we enable Data Science/Analytics on data and still protect it? Metadata tagging (information about the data) is crucial to solving the data lake security problem and is the linchpin to the effective management of data throughout the data science/analytics lifecycle.

Data Science/Analytics has changed the way we use data. Effective use of a data science platform is entirely dependent on access to data. One of the core ideas behind the practice of data science is unification. Smaller insights are aggregated into larger patterns, shedding light on opportunities to solve a problem. When structural barriers exist, even the most sophisticated algorithms will reach an impasse. No greater barrier to analysis exists than data silos.

Currently, the most sensitive data in the intelligence community is protected by physical and logical boundaries, with classification applied to the data at the silo level. This arrangement inhibits functional data science, which needs access to the entire data estate. Combining data with different security levels creates another challenge: data with a higher security level ends up in a pool with data that has a lower security level. To enable the pooling of data with different classification levels, a more precise way to control security and classification at the object and file level is needed. Each file or object will need to have a security level applied to it, and each will need to be able to “self protect”, automatically enforcing its security requirements. There are petabytes of data within the ODNI’s control. Only when each object is capable of “self protecting”, so that data sources can be pooled, will large-scale insights be unlocked by the talented Data Scientists working on behalf of the intelligence community.
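As a rough sketch of what object-level, “self protecting” data could look like, the example below attaches a classification label to each object and has the object itself refuse reads from callers with insufficient clearance. The `Level` names, the `ProtectedObject` class, and the `readable` helper are assumptions made for illustration. A real system would enforce this with encryption and policy enforcement points, not an in-process check.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    """Illustrative ordering of classification levels (assumed for this sketch)."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


@dataclass(frozen=True)
class ProtectedObject:
    """A data object that carries its own security label as metadata."""
    name: str
    level: Level
    payload: bytes = field(repr=False)  # keep content out of logs and reprs

    def read(self, clearance: Level) -> bytes:
        """Release the payload only if the caller's clearance covers the label."""
        if clearance < self.level:
            raise PermissionError(f"{self.name} requires {self.level.name}")
        return self.payload


def readable(pool, clearance):
    """Names of the objects in a mixed-level pool that a given clearance may read."""
    return [obj.name for obj in pool if clearance >= obj.level]


# One pool holding objects at different levels, instead of one silo per level.
pool = [
    ProtectedObject("weather.csv", Level.UNCLASSIFIED, b"..."),
    ProtectedObject("logistics.csv", Level.SECRET, b"..."),
    ProtectedObject("sources.csv", Level.TOP_SECRET, b"..."),
]
print(readable(pool, Level.SECRET))
```

Because the label travels with each object, the same pool can serve analysts with different clearances without first splitting the data back into silos.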



 

 


Learn More

  • The graphic for this article comes from the Asimov Institute site, on a page called the “Neural Network Zoo”. The graphic describes a Boltzmann machine (BM). Some neurons are marked as input neurons and others remain “hidden”. The input neurons become output neurons at the end of a full network update. It starts with random weights and learns through back-propagation, or more recently through contrastive divergence (a Markov chain is used to determine the gradients between two informational gains). Compared to a Hopfield network (HN), the neurons mostly have binary activation patterns. As hinted by being trained by Markov chains (MCs), BMs are stochastic networks. The training and running process of a BM is fairly similar to that of an HN: one sets the input neurons to certain clamped values, after which the network is set free (it doesn’t get a sock). While free, the cells can take any value, and we repeatedly go back and forth between the input and hidden neurons. The activation is controlled by a global temperature value, which, if lowered, lowers the energy of the cells. This lower energy causes their activation patterns to stabilise. The network reaches an equilibrium given the right temperature.

    The original paper can be found here.

    Hinton, Geoffrey E., and Terrence J. Sejnowski. “Learning and relearning in Boltzmann machines.” Parallel distributed processing: Explorations in the microstructure of cognition 1 (1986): 282-317.
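The running phase described in the note above (clamp some units, let the rest update stochastically under a global temperature) can be sketched in a few lines. This is a toy illustration that covers only sampling, not training. The three-unit network, its weights and biases, and the function names are made up for this example and are not taken from the Asimov Institute page or the paper.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def energy(state, weights, biases):
    """Energy of a binary state; lower energy means a more stable configuration."""
    n = len(state)
    total = -sum(biases[i] * state[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            total -= weights[i][j] * state[i] * state[j]
    return total


def run_free(state, weights, biases, clamped, temperature, sweeps, rng):
    """Stochastically update every unit that is not clamped."""
    state = list(state)
    free_units = [i for i in range(len(state)) if i not in clamped]
    for _ in range(sweeps):
        for i in free_units:
            net = biases[i] + sum(
                weights[i][j] * state[j] for j in range(len(state)) if j != i
            )
            # A lower temperature makes the on/off decision closer to deterministic.
            p_on = sigmoid(net / temperature)
            state[i] = 1 if rng.random() < p_on else 0
    return state


# Toy network: symmetric weights, zero diagonal, unit 0 clamped on.
weights = [[0, 4, 4], [4, 0, 4], [4, 4, 0]]
biases = [-2, -2, -2]
start = [1, 0, 0]
settled = run_free(start, weights, biases, clamped={0}, temperature=0.05,
                   sweeps=5, rng=random.Random(0))
print(settled, energy(start, weights, biases), energy(settled, weights, biases))
```

At a high temperature the same network keeps flipping units almost at random; lowering the temperature is what lets it settle into a low-energy pattern.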





Spotlight on DHS: Artificial Intelligence Funding by Admin

The Department of Homeland Security includes roughly $90 million in its FY 2020 budget to boost the use of Artificial Intelligence at the department.

The Department of Homeland Security (DHS) recently released its FY 2020 Budget Request with a total of nearly $58 billion in top line discretionary spending being proposed. Included in this discretionary total is $7.1 billion proposed for the DHS information technology (IT) budget to address infrastructure and mission systems across its numerous agencies and directorates.

Artificial Intelligence Efforts at DHS

Increasingly, DHS and other federal agencies are looking for ways to leverage emerging technologies like Artificial Intelligence (AI), machine learning (ML) and related technological approaches to improve their mission effectiveness, stretch their workforce capacity and improve efficiencies.

Below is a summary of some key programs and efforts underway at various DHS components that are included in its FY 2020 budget:

Cybersecurity and Infrastructure Security Agency (CISA)

  • National Cybersecurity Protection System (NCPS) (a.k.a. EINSTEIN) – NCPS plans to continue to enhance analytics capabilities that leverage artificial intelligence to detect malicious activity and further automate cyber threat analysis. NCPS allocates $21.6M in FY 2020 for analytics efforts.

Transportation Security Administration (TSA)

  • Computed Tomography Algorithm Development – The FY 2020 budget includes $12.6M to continue advanced algorithm development that uses machine learning to enhance TSA checkpoint threat detection effectiveness and efficiency and support automation in future security systems.

U.S. Coast Guard

  • The Operational Performance Improvements and Modeling (OPIM) program – Sub-project: Exploring Machine Learning (ML) for Application in Coast Guard Mission Planning and Disaster Response – The sub-project researches the use of ML to improve the Coast Guard’s emergency preparedness and increase response effectiveness in active disasters. The OPIM program receives $500K for FY 2020, in part to execute the AI/ML Proof of Concept and test algorithm optimization.

Science and Technology (S&T) Directorate

  • Public Safety Wireless Communications program – Sub-project: Wearable Alert and Monitoring System – The WAMS is composed of wearable devices called sensor nodes that connect to Internet of Things (IoT) sensors, as well as controller software that works with both local and remote artificial intelligence agents in the cloud to provide on demand communication and computing based on first responders’ needs. The overall PSWC program is allocated $5.7M in FY 2020.

  • Research Supporting Public Safety Broadband Implementation – Sub-project: Artificial Intelligence/Artificial General Intelligence Integration – The project focuses on the integration of machine learning technologies with existing networks and associated capabilities to address public safety’s most pressing needs. Total program funding for FY 2020 is $5M spread across 4 sub-projects.

  • Flood program – Sub-project: New products from high performance computing and artificial intelligence – Apply computer learning technologies and facial recognition algorithms to the development of a national inventory of structures database for flood-prone areas, especially for identified FEMA Special Flood Hazard Areas (SFHAs), including type of structure, elevation, tax assessment, ownership and other relevant data. Work with private sector companies to investigate the feasibility of transitioning the national structures inventory to become a commercial product that supports flood and other disaster insurance underwriting. Total program funding for FY 2020 is $5M spread across 7 sub-projects.

  • Data Analytics Technology Center (DA-TC) – The center provides an agile core technical service that helps DHS to adapt and leverage growing data sets and rapidly evolving technologies, including social media, live streaming, real-time analytics, machine learning and artificial intelligence. Established in FY 2016, DA-TC receives $10.4M in the FY 2020 budget.

  • Office for Interoperability and Compatibility Technology Center (OIC-TC) – The center conducts R&D to improve interoperable emergency communications. Planned FY 2020 milestones include researching an edge-computing device to readily share Artificial Intelligence (AI)-infused data, e.g., body-worn camera video and physiological and environmental sensor data. Such a device would provide intelligent fusion of wearable sensor data and share alerts with a localized user if disconnected from the network, as well as with networked users using available connections. The center receives $2.7M in project funding for FY 2020.

  • People Screening program – The program includes an FY 2020 milestone of evaluating the technical feasibility of repurposing commercially available IoT sensors, wearable technologies, and machine learning to improve people screening operational measurement accuracy, precision, and reliability as well as officer situational awareness. This program receives $3.5M in FY 2020.

  • Next Generation First Responder program – This program plans to conduct an experiment to assess how the Assistant for Understanding Data through Reasoning, Extraction, and sYnthesis (AUDREY) artificial intelligence and data analytics capabilities can enhance paramedic decision-making and help improve patient outcomes. The program has $4.5M in the current FY 2019 budget with a planned completion timeline of Q3 of FY 2019. However, it is possible that funding could be extended into FY 2020 or that related work could be incorporated into related future efforts.

Countering Weapons of Mass Destruction (CWMD) Office

  • The Detection Capability Development (DCD) program – sub-project: Enhanced Radiological Nuclear Inspection and Evaluation (ERNIE) – This project is an advanced machine learning (ML) approach to analyze radiation scans for improved threat detection. The total DCD program receives $33M for FY 2020, of which ERNIE receives a portion.