Data Mining Courtesy of YOU!

November 24, 2014 11:22 am

Every day, more devices are connected to the Internet.
The number of Internet-connected devices reached 8.7 billion in 2012 and is still climbing; several estimates of connected devices now fall in the eight to ten billion range.

This number includes traditional computers and mobile devices as well as newer industrial and consumer devices: acceleration- and/or heart-rate-based health devices, GPS devices, and the connected devices of the healthcare, transportation and energy industries that change the way we do things.
These devices range from tiny and cheap to huge and expensive. Today, estimates are that two-thirds of the devices operate autonomously while one-third require continuous physical connectivity.
The devices that make up “The Internet of Things” also reveal intimate details about the comings and goings of their owners through the sensors they contain.

A “connected device” measures some physical quantity and can be connected to a general-purpose computer using some standard connector or connection technology.

Connected devices are able to communicate with each other, sometimes without the user being aware such communications are taking place.

Connected devices are what we need to measure and interface with those “devices”.

Whatever the type, consumers want every connected device to be seamlessly integrated with every other connected device.

The first thing is to know what types of connected devices actually exist.
The goal for designers and developers is to work with device manufacturers and the technical community to provide a definitive, curated catalog of all known connected device systems.
Beyond cataloging information about all connected devices, the goal is to actually connect to the devices, get data from them and then do all sorts of things with that data.

Already in the marketplace are upwards of a few thousand devices from about 300 companies. For each device, there is a certain amount of structured information, and the data it produces can be turned into complete interactive reports that can be deployed on the web, on mobile or offline. Additionally, there are lots of other kinds of devices, measuring scores of different physical quantities.

Connected devices are central to a strategy of imbuing sophisticated computation and knowledge into everything; with a unique Language we now have a way to describe and compute about things in the world.

Ultimately, the data must be harvested out of a device.

There must be a good way to represent all the kinds of data that can come out of a device.
A standard protocol language provides a way to represent the data.
A person may then want to interact with the data directly on that local computer. A data buyer wants either to have something autonomous happen with the data on the local computer, or to get the data into the cloud and systematically have people or machines query it and generate reports from it.
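
One way to picture such a representation is a small self-describing record that carries the unit and the time along with the number. The sketch below is only an illustration: the `Reading` class, its field names and the JSON layout are assumptions made for this example and are not taken from any Wolfram specification.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Reading:
    """One measurement from a connected device (hypothetical schema)."""
    device_id: str   # identifier of the device that produced the value
    quantity: str    # what was measured, e.g. "heart rate"
    value: float     # the measured number
    unit: str        # unit of measure, e.g. "beats per minute"
    timestamp: str   # ISO 8601 time of the measurement


def encode(reading: Reading) -> str:
    """Serialize a reading to JSON so another system can consume it."""
    return json.dumps(asdict(reading), sort_keys=True)


def decode(text: str) -> Reading:
    """Rebuild a reading from its JSON form."""
    return Reading(**json.loads(text))


# A made-up sample value, used only to show the round trip.
sample = Reading("wristband-01", "heart rate", 72.0,
                 "beats per minute", "2014-11-24T11:22:00")
wire = encode(sample)
print(wire)
```

Because the record names its own quantity and unit, the same text can be read on the local computer or shipped to the cloud without the receiver having to guess what the number means.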

In both these cases, it’s often really convenient to have the basic device connect to some kind of small embeddable computer system. There need to be mechanisms built into the device language that allow both for immediate discovery and for communication with the cloud. The language also needs to be usable for aggregating data from networks around the globe.

The Wolfram Project assembled the world’s most complete system for handling physical quantities and their units. The Wolfram system gives devices an immediate way to represent not just raw numbers, but also images, geo-positions and actual measured physical quantities.

What physical quantities?
Hundreds of thousands of physical quantities are built in (like length, or torque, or tensile strength, or clicks per impression), as well as nearly 10,000 units of measure (like inches, or meters per second, or katals, or micropascals per square root hertz).
Now, with the new language for connected devices, we have a way to attach this numeric representation to real things in the world.
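
To make the idea of a number that carries its unit concrete, here is a minimal sketch. It is not the Wolfram system: the `Quantity` class and the tiny `TO_BASE` table are hypothetical names invented for this example, and a real system would cover thousands of units rather than five.

```python
from dataclasses import dataclass

# Conversion factors to a base unit, grouped by dimension.
# A tiny hand-picked table, for illustration only.
TO_BASE = {
    "length": {"meter": 1.0, "inch": 0.0254, "kilometer": 1000.0},
    "pressure": {"pascal": 1.0, "micropascal": 1e-6},
}


@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str

    def dimension(self) -> str:
        """Return the dimension (e.g. "length") this unit belongs to."""
        for dim, units in TO_BASE.items():
            if self.unit in units:
                return dim
        raise ValueError(f"unknown unit: {self.unit}")

    def to(self, target: str) -> "Quantity":
        """Convert to another unit of the same dimension."""
        dim = self.dimension()
        if target not in TO_BASE[dim]:
            raise ValueError(f"cannot convert {self.unit} to {target}")
        base = self.value * TO_BASE[dim][self.unit]
        return Quantity(base / TO_BASE[dim][target], target)


print(Quantity(100.0, "inch").to("meter"))
```

The point of the design is that a mismatched conversion, such as inches to pascals, is rejected instead of silently producing a meaningless number.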

Then it becomes about knowing what the data means.

In fact, there are lots of things that can be done with the data that has been volunteered by the consumer.

For researchers, for example, a Data Repository might be created that lets people publish data from devices;
or a Science Platform that lets people visualize and analyze data;
or an opportunity for consumers to pose queries about devices, very much like the queries you can make now about consumer products.
For manufacturers, a possible database with every purchaser’s name, sizes and credit card information; for utility companies and other corporations, actual real-time windows into the life of any person on the planet with a connected device.

Wolfram Data Framework.

In a sense, what WDF does is take everything learned about representing data and the world and make it available to use on data from anywhere.

Their goal is to get seamless integration of as many kinds of devices as possible.
And the more kinds of devices there are, the more interesting things are going to get.
Why? Because it means we can connect to more and more aspects of the physical world, and be in a position to compute more and more about it.

Self-determination is an inalienable right for all human beings.
Personal development should not be defined by what business and government know about you.
The proliferation of the internet of things increases the risk that this will happen.

During the 36th International Privacy Conference held in Balaclava, Mauritius, four speakers representing both the private sector and academia presented the data protection and privacy commissioners with the positive changes the “internet of things” may bring to our daily lives, as well as the risks, and discussed the possibilities of the internet of things and its consequences.
The speakers also took stock of what needs to be done in order to ensure the continued protection of our personal data as well as our private lives.
The subsequent discussion led to the following observations and conclusions:

• Internet of things’ sensor data is high in quantity, quality and sensitivity. This means the inferences that can be drawn are much broader and more sensitive, and more easily tied to identifiable individuals. Considering that user identity and protection of big data is already a major challenge, it is clear that big data derived from internet of things devices makes this challenge many times larger. Therefore, such data should be regarded and treated as personal data.
• Even though for many companies the business model is as yet unknown, it is clear that the value of the internet of things is not only in the devices themselves. The money is in the new services related to the internet of things and in the data.
• Everyone who lives today will realize that connectivity is ubiquitous. This may apply even more strongly to the young and to future generations, who cannot imagine a world without being connected. Whether or not their data is protected should not, though, be solely their concern.

How Users of Connected Devices Give Away Personal Information

People who use wearable gadgets to monitor their health or activity can be tracked with only $70 of hardware, research suggests.

A Symantec research team used a barebones Raspberry Pi computer to which they added a Bluetooth radio module to help sniff for signals. At no time did the device try to connect to any wearable. Rather, it just scooped up data being broadcast from wearable gadgets.

The researchers, Mario Barcena, Candid Wueest and Hon Lau, said “All the devices we encountered can be easily tracked using the unique hardware address they transmit.”
The work, carried out by security firm Symantec, used a Raspberry Pi computer to grab data broadcast by the gadgets.
The researchers took their Pi to busy public places in Switzerland and Ireland, including sporting events, to see what data they could grab.
The souped-up Pi was able to pick out individuals in the crowds.
Some of the devices picked up were susceptible to being probed remotely, making them reveal serial numbers or other identifying information.
Further, the research team looked at the applications associated with or supporting activity monitors or which use a smart phone to gather data.
About 20% of the apps Symantec looked at did nothing to conceal data being sent across the Internet, even though it contained important ID information: name, passwords and birthdate. Many apps did not do enough to secure the passage of data from users back to central servers.
In some cases it was possible to manipulate data to read information about other users or trick databases into executing commands sent by external agents.
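
The last finding, databases tricked into executing commands sent by outsiders, resembles what is usually called an injection flaw. The report as quoted here does not name the exact mechanism, so the sketch below is an assumption about the general class of bug, not a reconstruction of any app Symantec tested. It uses Python's standard `sqlite3` module with a made-up `users` table to contrast a query built by pasting text together with one that passes the value as a parameter.

```python
import sqlite3

# An in-memory database with invented rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, birthdate TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "1980-01-01"), ("bob", "1975-05-05")])


def lookup_unsafe(name: str) -> list:
    # Vulnerable: the caller's text becomes part of the SQL statement.
    query = f"SELECT name, birthdate FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def lookup_safe(name: str) -> list:
    # Safer: the value is passed separately and is never parsed as SQL.
    return conn.execute(
        "SELECT name, birthdate FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "nobody' OR '1'='1"
print(len(lookup_unsafe(attack)))  # prints 2: every row leaks
print(len(lookup_safe(attack)))    # prints 0: no user has that literal name
```

With the unsafe version, one crafted string is enough to read information about other users, which is the kind of outcome the researchers describe.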

Symantec said the eavesdropping was possible because most wearables were very simple devices that communicated with a smart phone or a laptop when passing on data they have collected.
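
Why a constant hardware address amounts to tracking can be shown without any radio at all. The sketch below works on simulated observations only: the addresses, times and places are invented, and `build_tracks` is a hypothetical helper, not Symantec's tooling. It simply groups passively logged sightings by the address that was broadcast.

```python
from collections import defaultdict

# Simulated passive observations: (time, place, broadcast hardware address).
# All values are made up for illustration.
sightings = [
    ("09:00", "train station", "AA:BB:CC:00:00:01"),
    ("09:05", "train station", "AA:BB:CC:00:00:02"),
    ("12:30", "stadium gate", "AA:BB:CC:00:00:01"),
    ("18:10", "city park", "AA:BB:CC:00:00:01"),
]


def build_tracks(observations):
    """Group sightings by address; each group is one device's trail."""
    tracks = defaultdict(list)
    for when, where, address in observations:
        tracks[address].append((when, where))
    return dict(tracks)


tracks = build_tracks(sightings)
for address, trail in tracks.items():
    print(address, "->", trail)
```

No connection to the wearable is needed: as long as the address never changes, every sighting of it extends the same person's trail.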

Symantec said makers of wearables need to do a better job of protecting privacy and handling data they gather.

It would be “trivial”, said the researchers, for anyone with a modicum of computer and electronics knowledge to gather this information.
There are many different end results that manufacturers of devices typically want. They say, “We’re building this great device; now we want to do things with the data from it.” That can mean anything from:
• analyzing it,
• selling it to customers,
• doing analysis or reporting,
• exposing the data,
• producing alerts from the data,
• aggregating lots of personal data, to
• combining data from multiple devices to create “synthetic sensors”.
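
A “synthetic sensor” can be pictured as a value that no single device reports but that falls out once two data streams are combined. The sketch below is a toy under stated assumptions: the `Sample` fields, the `activity_state` and `alert_needed` functions and the thresholds are all invented for illustration and carry no medical meaning.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    heart_rate_bpm: float    # from a heart-rate monitor
    steps_per_minute: float  # from an accelerometer-based step counter


def activity_state(sample: Sample) -> str:
    """Derive a label that neither device reports by itself.

    The thresholds are arbitrary illustrative values, not medical guidance.
    """
    moving = sample.steps_per_minute >= 60
    elevated = sample.heart_rate_bpm >= 100
    if moving and elevated:
        return "exercising"
    if moving:
        return "walking"
    if elevated:
        return "elevated heart rate at rest"
    return "resting"


def alert_needed(sample: Sample) -> bool:
    """Produce an alert when the pulse is high but the wearer is still."""
    return activity_state(sample) == "elevated heart rate at rest"


print(activity_state(Sample(heart_rate_bpm=120, steps_per_minute=0)))
```

The same combination that makes the derived value useful is what makes it sensitive: it says more about the wearer than either raw stream does alone.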

It should be possible to take data from the device and flow it into the language system for processing, for example to create a portal or dashboard on the web, or on a mobile device, for every single user of a particular type of device.

Consumers must expect that more and more connected devices will end up having the computing power to run the language processing completely internally, on increasingly small and ubiquitous embedded computers.

“The lack of basic security at this level is a serious omission and raises serious questions about how these services handle information stored on their servers, these are serious security lapses that could lead to a major breach of the user database”, said the Symantec team.