Four short links: 11 February 2019

  1. Reflecting on The Soul of a New Machine (Bryan Cantrill) — re-reading the book now from start to finish has given new parts depth and meaning. Aspects that were more abstract to me as an undergraduate—from the organizational rivalries and absurdities of the industry to the complexities of West’s character and the tribulations of the team down the stretch—are now deeply evocative of concrete episodes of my own career.
  2. ExFaKT: a framework for explaining facts over knowledge graphs and text. […] ExFaKT uses background knowledge encoded in the form of Horn clauses to rewrite the fact in question into a set of other easier-to-spot facts.
  3. FreedomEV — third-party Linux for your rooted Tesla.
  4. Redesigning the System: Music is abundant; purpose is scarce.

Core technologies and tools for AI, big data, and cloud computing

Cubes

(source: Pixabay)

Profiles of IT executives suggest that many are planning to spend significantly on cloud computing and AI over the next year. This concurs with survey results we plan to release over the next few months. In a forthcoming survey, “Evolving Data Infrastructure,” we found strong interest in machine learning (ML) among respondents across geographic regions. Not only are companies interested in tools, technologies, and people who can advance the use of ML within their organizations, they are also beginning to build the core foundational technologies needed to sustain their usage of analytics and ML. With that said, important challenges remain. In other surveys we ran, respondents cited “lack of skilled people,” “lack of data,” and cultural and organizational challenges as the leading obstacles holding back the adoption of machine learning and AI.

In this post, I’ll describe some of the core technologies and tools companies are beginning to evaluate and build. Many companies are just beginning to address the interplay between their suite of AI, big data, and cloud technologies. I’ll also highlight some interesting use cases and applications of data, analytics, and machine learning. The resource examples I’ll cite will be drawn from the upcoming Strata Data conference in San Francisco, where leading companies and speakers will share their learnings on the topics covered in this post.

AI and machine learning in the enterprise

When asked what holds back the adoption of machine learning and AI, survey respondents for our upcoming report, “Evolving Data Infrastructure,” cited “company culture” and “difficulties in identifying appropriate business use cases” among the leading reasons. Attendees of the Strata Business Summit will have the opportunity to explore these issues through training sessions, tutorials, briefings, and real-world case studies from practitioners and companies. Recent improvements in tools and technologies have meant that techniques like deep learning are now being used to solve common problems, including forecasting, text mining and language understanding, and personalization. We’ve assembled sessions from leading companies, many of which will share case studies of applications of machine learning methods, including multiple presentations involving deep learning:

Foundational data technologies

Machine learning and AI require data—specifically, labeled data for training models. There are many articles that point to the explosion of data, but in order for that data to be useful for analytics and ML, it has to be collected, transported, cleaned, stored, and combined with other data sources. Thus, our surveys have shown that companies tend to apply machine learning and AI in areas where they have prior simpler use cases (business intelligence and analytics) that required data technologies to already be in place. In our upcoming report, “Evolving Data Infrastructure,” respondents indicated they are beginning to build essential components needed to sustain machine learning and AI within their organizations:

Take data lineage, an increasingly important consideration in an age when machine learning, AI, security, and privacy are critical for companies. At Strata Data San Francisco, Netflix, Intuit, and Lyft will describe internal systems designed to help users understand the evolution of available data resources. As companies ingest and use more data, there are many more users and consumers of that data within their organizations. Data lineage, data catalog, and data governance solutions can increase usage of data systems by enhancing trustworthiness of data. Moving forward, tracking data provenance is going to be important for security, compliance, and for auditing and debugging ML systems.
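
To make the idea of lineage concrete, here is a minimal sketch in Python. It is not how Netflix, Intuit, or Lyft build their systems; it only assumes that lineage can be modeled as a graph in which each dataset points to the datasets it was derived from. The class name `LineageGraph` and the dataset names are hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store (hypothetical): maps a dataset to its direct inputs."""

    def __init__(self):
        self._parents = defaultdict(set)

    def record(self, output, inputs):
        """Register that `output` was produced from the datasets in `inputs`."""
        self._parents[output].update(inputs)

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or indirectly."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Illustrative pipeline; the dataset names are made up for this example.
lineage = LineageGraph()
lineage.record("clean_events", ["raw_events"])
lineage.record("features", ["clean_events", "user_profiles"])
lineage.record("churn_model", ["features"])

print(sorted(lineage.upstream("churn_model")))
```

Walking the graph upstream from a model answers the auditing question raised above: which raw sources could have influenced this output?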

Companies are embracing AI and data technologies in the cloud

In the survey behind our upcoming report, “Evolving Data Infrastructure,” we found 85% of respondents indicated they had data infrastructure in at least one of the seven cloud providers we listed, with nearly two-thirds (63%) using Amazon Web Services (AWS) for some portion of their data infrastructure. We found companies run a mix of open source technologies and managed services, and many respondents indicated they used more than one cloud provider.

This agrees with other surveys I’ve come across that indicated IT executives plan to invest a significant portion of their budgets in cloud computing resources and services.

Security and privacy

Regulations in Europe (GDPR) and California (Consumer Privacy Act) have placed concepts like “user control” and “privacy-by-design” at the forefront for companies wanting to deploy ML. With these new regulations in mind, the research community has stepped up and new privacy-preserving tools and techniques—including differential privacy—are becoming available for both business intelligence and ML applications. Strata Data San Francisco will feature sessions on important topics including: data security and data privacy; the use of data, analytics, and ML in (cyber)security; privacy-preserving analytics; and secure machine learning.
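
For readers new to differential privacy, the following is a rough sketch of one textbook technique, the Laplace mechanism, applied to a counting query. It is an illustration only and is not drawn from any of the conference sessions. The function names `laplace_noise` and `private_count`, the sample records, and the choice of `epsilon` are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: how many users are 30 or older?
records = [{"age": age} for age in (25, 41, 37, 19, 52)]
noisy = private_count(records, lambda r: r["age"] >= 30, epsilon=1.0)
print(f"noisy count: {noisy:.2f}")
```

Smaller values of `epsilon` add more noise and give stronger privacy; larger values give more accurate answers. Production systems also need to track the total privacy budget across queries, which this sketch does not attempt.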

Ethics

When it comes to ethics, it’s fair to say the data community (and the broader technology community) is very engaged. As I noted in an earlier post, the next-generation data scientists and data engineers are undergoing training and engaging in discussions pertaining to ethics. Many universities are offering courses; some, like UC Berkeley, have multiple courses. We’re at the point where companies are beginning to formulate and share some best practices and processes. We are pleased to announce that we have a slate of tutorials and sessions—and a full day of presentations dedicated to ethics—at the upcoming Strata Data conference in San Francisco.

Use cases and solutions

Data, machine learning, and AI are impacting companies across industries and geographic locations. Companies are beginning to build key components including solutions that address data lineage and data governance, as well as tools that can increase the productivity of their data scientists (“data science platforms”). Many technologies and techniques are general purpose and cut across domains and industries. However, there are tools and methods that are used more heavily in certain verticals, and more importantly, we all like learning what our industry peers have been building and thinking about. Here are some related talks from a few verticals:

Four short links: 8 February 2019

  1. Blazer: Explore your data with SQL. Easily create charts and dashboards, and share them with your team.
  2. FPG-1: PDP-1 FPGA implementation in Verilog, with CRT, Teletype, and Console. The PDP-1 was groundbreaking: serial number 0 was delivered to the BBN offices where Licklider would see it as a way forward to his timesharing vision. From The Dream Machine: “The PDP-1 was revolutionary,” Fredkin declares, still marveling four decades later. “Today such things don’t happen. Today a machine comes along and is slightly faster than its competitors. But here was a machine that was off the charts. Its price performance ratio was spectacularly better than anything that had come before.”
  3. ClusterFuzz: a scalable fuzzing infrastructure that finds security and stability issues in software. See Google’s announcement of the open-sourcing of it.
  4. Questions for a New Technology: They aren’t particularly subtle in their bias. They aren’t supposed to be. They also aren’t meant to be a series of boxes to be checked or hoops to be jumped through.

Four short links: 7 February 2019

  1. Hamlet in Virtual Reality — context for WGBH’s Hamlet 360. It’s 360º video, so you can pick what you look at but not where you look at it from. Interesting work, and a reminder that we’re still trying to figure out what kinds of stories these media lend themselves to, and how best to tell stories with them.
  2. Self-Taught Robot Figures Out What It Looks Like and What It Can Do: To begin with, the robot had no idea what shape it was and behaved like an infant, moving randomly while attempting various tasks. Within about a day of intensive learning, the robot built up an internal picture of its structure and abilities. After 35 hours, the robot could grasp objects from specific locations and drop them in a receptacle with 100% accuracy. Paper is behind a paywall, though Sci-Hub has it.
  3. Bubble Sort: An Archaeological Algorithmic Analysis: Text books, including books for general audiences, invariably mention bubble sort in discussions of elementary sorting algorithms. We trace the history of bubble sort, its popularity, and its endurance in the face of pedagogical assertions that code and algorithmic examples used in early courses should be of high quality and adhere to established best practices. This paper is more an historical analysis than a philosophical treatise for the exclusion of bubble sort from books and courses. However, sentiments for exclusion are supported by Knuth: “In short, the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems.” Although bubble sort may not be a best practice sort, perhaps the weight of history is more than enough to compensate and provide for its longevity.
  4. Comprehensive Survey on Graph Neural Networks: We propose a new taxonomy to divide the state-of-the-art graph neural networks into different categories. With a focus on graph convolutional networks, we review alternative architectures that have recently been developed; these learning paradigms include graph attention networks, graph autoencoders, graph generative networks, and graph spatial-temporal networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes and benchmarks of the existing algorithms on different learning tasks. Finally, we propose potential research directions in this fast-growing field.
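
For anyone who has not looked at it in a while, the bubble sort discussed in item 3 can be sketched in a few lines of Python. This is a common textbook variant with an early-exit flag, not code taken from the paper; the function name `bubble_sort` is chosen here for illustration.

```python
def bubble_sort(items):
    """Return a sorted copy of `items` using bubble sort (O(n^2) comparisons)."""
    result = list(items)
    # After each pass, the largest remaining element has "bubbled" to `end`.
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:
            # No swaps in a full pass means the list is already sorted.
            break
    return result


print(bubble_sort([5, 2, 9, 1, 5, 6]))
```

The nested loops are the reason for the quadratic running time that makes the algorithm a poor practical choice, which is the point the paper's authors and Knuth are making.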

Roaming free: The power of reading beyond your field

This is a keynote highlight from the O’Reilly Software Architecture Conference in New York 2019. Watch the full version of this keynote on the O’Reilly online learning platform.

You can also see other highlights from the event.

Design and architecture: Special Dumpster Fire Unit

This is a keynote highlight from the O’Reilly Software Architecture Conference in New York 2019. Watch the full version of this keynote on the O’Reilly online learning platform.

You can also see other highlights from the event.

Design after Agile: How to succeed by trying less

This is a keynote highlight from the O’Reilly Software Architecture Conference in New York 2019. Watch the full version of this keynote on the O’Reilly online learning platform.

You can also see other highlights from the event.

Four short links: 6 February 2019

  1. Flowblade: a multitrack non-linear video editor released under GPL3 license.
  2. Automatically Assembling Textbooks from Wikipedia: Adamti and co have a plan for determining the utility of their approach. They plan to produce a range of Wikibooks on subjects not yet covered by human-generated books. They will then monitor the page views and edits to these books to see how popular they become and how heavily they are edited, compared with human-generated books.
  3. Amazon Knows What You Buy. And It’s Building a Big Ad Business From It (NYT) — I’m sure nothing bad can happen from this.
  4. Firefox 66 to Block Automatically Playing Audible Video and Audio (Mozilla) — user-friendly behavior ftw.

170+ live online training courses opened for March and April

Computer and other gear

(source: Skitterphoto via Pixabay)

Learn new topics and refine your skills with more than 170 new live online training courses we opened up for March and April on the O’Reilly online learning platform.

AI and machine learning

Spotlight on Innovation: Succeeding with Machine Learning with Alex Jaimes, February 13

Hands-On Adversarial Machine Learning, February 25

Probabilistic Modeling With TensorFlow Probability, February 27

Deep Learning Fundamentals, March 5

An Introduction to Amazon Machine Learning on AWS, March 6-7

Natural Language Processing (NLP) from Scratch, March 11

Deep Reinforcement Learning, March 12

Sentiment Analysis for Chatbots in Python, March 13

Hands-on Machine Learning with Python: Classification and Regression, March 13

TensorFlow Extended: Data Validation and Transform, March 14

Hands-On Machine Learning with Python: Clustering, Dimension Reduction, and Time Series Analysis, March 14

Building a Robust Machine Learning Pipeline, March 14-15

Machine Learning in Practice, March 19

TensorFlow Extended: Model Build, Analysis, and Serving, March 20

Artificial Intelligence: An Overview of AI and Machine Learning, March 20

Machine Learning for IoT, March 20

Next Generation Decision Making: Pragmatic Artificial Intelligence, March 20-21

Getting Started with Machine Learning, March 21

Artificial Intelligence for Robotics, March 21-22

Beginning Machine Learning with PyTorch, March 25

Artificial Intelligence: Real-World Applications, March 28

Active Learning, April 9

Hands-On Adversarial Machine Learning, April 11

Practical Deep Learning with PyTorch, April 11-12

Blockchain

Introducing Blockchain, March 8

Building Smart Contracts on the Blockchain, March 21-22

IBM Blockchain Platform as a Service, March 25-26

Understanding Hyperledger Fabric Blockchain, March 28-29

Blockchain for Enterprise, April 1

Business

Innovative Teams, March 11

Fundamentals of Cognitive Biases, March 11

Artificial Intelligence: AI For Business, March 12

Business Strategy Fundamentals, March 13

The Power of Lean in Software Projects: Less Wasted Effort and More Product Results, March 14

Leadership Communication Skills for Managers, March 14

Emotional Intelligence in the Workplace, March 14

Thinking Like a Manager, March 14

Tools for the Digital Transformation, March 14-15

Introduction to Delegation Skills, March 21

Negotiation Fundamentals, March 22

Introduction to Critical Thinking, March 26

Your First 30 Days as a Manager, April 2

How to Give Great Presentations, April 5

Introduction to Strategic Thinking Skills, April 8

Data science and data tools

Business Data Analytics Using Python, February 27

Hands-on Introduction to Apache Hadoop and Spark Programming, March 5-6

Designing and Implementing Big Data Solutions with Azure, March 11-12

Time Series Forecasting, March 14

Cleaning Data at Scale, March 19

Practical Data Cleaning with Python, March 20-21

Building Distributed Pipelines for Data Science Using Kafka, Spark, and Cassandra, April 8-10

Real-Time Data Foundations: Kafka, April 9

Real-Time Data Foundations: Spark, April 10

Building Data APIs with GraphQL, April 11

Design and product management

From User Experience Designer to Digital Product Designer, March 1

Mastering UX Mapping, March 7-8

Writing User Stories, March 13

Product Roadmaps from the Ground Up, April 3

Programming

Design Patterns Boot Camp, February 19-20

Discovering Modern Java, March 1

Beginner’s Guide to Writing AWS Lambda Functions in Python, March 1

Building APIs with Django REST Framework, March 4

SQL for Any IT Professional, March 4

Spring Boot and Kotlin, March 5

Programming with Java Lambdas and Streams, March 5

Bootiful Testing, March 6

Learning Python 3 by Example, March 7

Getting Started with OpenShift, March 8

Setting Up Scala Projects, March 11

Getting Started with Pandas, March 11

Getting Started with Python 3, March 11-12

Java Full Throttle with Paul Deitel: A One-Day, Code-Intensive Java Standard Edition Presentation, March 12

Mastering Pandas, March 12

Scalable Concurrency with the Java Executor Framework, March 12

Getting Started with Python’s Pytest, March 13

Python Programming Fundamentals, March 13

Mastering Python’s Pytest, March 14

Kotlin Fundamentals, March 14

Quantitative Trading with Python, March 14

Advanced TDD (Test-Driven Development), March 15

Introduction to Python Programming, March 15

Bash Shell Scripting in 4 Hours, March 18

Java Testing with Mockito and the Hamcrest Matchers, March 19

Scala Core Programming: Methods, Classes, and Traits, March 19

Ansible in 4 Hours, March 19

Getting Started with PHP and MySQL, March 20

Mastering the Basics of Relational SQL Querying, March 20-21

Reactive Spring and Spring Boot, March 21

Automating with Ansible, March 22

Scala Core Programming: Sealed Traits, Collections, and Functions, March 25

Mastering SELinux, March 25

Intermediate Git, March 25

Scalable Programming with Java 8 Parallel Streams, March 27

Design Patterns Boot Camp, March 27-28

Mastering C# 8.0 and .NET Core 3.0, March 27-28

Rethinking REST: A Hands-On Guide to GraphQL and Queryable APIs, March 28

C# Programming: A Hands-On Guide, March 28

Web Application Programming in C# and ASP.NET Core with MVC and Entity Framework, March 28-29

Introduction to JavaScript Programming, April 2-3

Visualization in Python with Matplotlib, April 8

Python for Finance, April 8-9

Practical MQTT for the Internet of Things, April 8-9

Getting Started with Pandas, April 9

Getting Started with Python 3, April 9-10

Getting Started with React.js, April 10

What’s New In Java, April 11

Fundamentals of Rust, April 11-12

Security

CompTIA PenTest+ Crash Course, March 5-6

Start Your Security Certification Career Today, March 8

Protecting Data Privacy in a Machine Learning World, March 11-12

Certified Ethical Hacker (CEH) Crash Course, March 12-13

CompTIA Security+ SY0-501 Crash Course, March 18-19

Intense Introduction to Hacking Web Applications, March 19

Cyber Security Fundamentals, March 26-27

CISSP Crash Course, March 26-27

CISSP Certification Practice Questions and Exam Strategies, March 27

AWS Certified Security – Specialty Crash Course, March 27-28

Systems engineering and operations

Software Architecture by Example, February 21

Red Hat Certified System Administrator (RHCSA) Crash Course, March 4-7

Creating Serverless APIs with AWS Lambda and API Gateway, March 5

Amazon Web Services (AWS): Up and Running, March 6

Docker Compose, March 6

Microservice Collaboration, March 7

Docker CI/CD, March 7

OpenStack for Cloud Architects, March 7-8

Red Hat RHEL 8 New Features, March 11

From Developer to Software Architect, March 11-12

Google Cloud Certified Associate Cloud Engineer Crash Course, March 11-12

AWS Certified Solutions Architect Associate Crash Course, March 11-12

9 Steps to Awesome with Kubernetes, March 12

IP Subnetting from Beginning to Mastery, March 12-13

Istio on Kubernetes: Enter the Service Mesh, March 14

How the Internet Really Works, March 15

Kubernetes Serverless with Knative, March 15

AWS Advanced Security with Config, GuardDuty, and Macie, March 18

Software Architecture by Example, March 18

Amazon Web Services: AWS Managed Services, March 18-19

Practical Kubernetes, March 18-19

AWS Certified SysOps Administrator (Associate) Crash Course, March 18-19

CCNA Routing and Switching 200-125 Crash Course, March 18-22

Managing Containers on Linux, March 19

Docker Images, March 19

Docker: Up and Running, March 19-20

Docker Containers, March 20

Implementing Evolutionary Architectures, March 20-21

Kubernetes in 4 Hours, March 21

AWS Security Fundamentals, March 21

Deploying Container-Based Microservices on AWS, March 21-22

Google Cloud Platform (GCP) for AWS Professionals, March 22

Architecture for Continuous Delivery, March 25

Docker for JVM projects, March 25

Implementing Azure for Enterprises, March 25-26

Building and Managing Kubernetes Applications, March 26

Cloud Computing Governance, March 26

Getting Started with Amazon Web Services (AWS), March 26-27

Microservices Caching Strategies, March 27

Cloud Complexity Management, March 28

Comparing Service-Based Architectures, March 28

Network DevOps, March 29

API Driven Architecture with Swagger and API Blueprint, March 29

Software Architecture for Developers, April 1

Implementing and Troubleshooting TCP/IP, April 2

Amazon Web Services (AWS) Technical Essentials, April 2

Building Applications with Apache Cassandra, April 3-4

Introduction to Kubernetes, April 3-4

CCNA Routing and Switching Crash Course, April 4-5

Architecting Secure IoT Applications with Azure Sphere, April 4-5

AWS Design Fundamentals, April 9-10

Microservices Architecture and Design, April 9-10

Practical Docker, April 10

Automation with AWS Serverless Technologies, April 10

The future of cloud-native programming

This is a keynote from the O’Reilly Software Architecture Conference in New York 2019. See other highlights from the event.

This keynote was sponsored by IBM.
