Keeping You Up On The Latest

Archive for the ‘Data’ Category

Alexander Graham Bell and His Voice

This previously unplayable wax recording from 1885 can now be played thanks to modern technology. The voice: telephone inventor Alexander Graham Bell at the Smithsonian Institution.

Researchers have identified the voice of Alexander Graham Bell for the first time in some of the earliest audio recordings held at the Smithsonian Institution.

The National Museum of American History announced Wednesday that Bell’s voice was identified with help from technicians at the Library of Congress and the Lawrence Berkeley National Laboratory in California. The museum holds some of the earliest audio recordings ever made. Researchers located a transcript of one recording signed by Bell. It was matched to a wax disc recording from April 15, 1885. “Hear my voice,” the inventor Alexander Graham Bell said. The experimental recording also contains a series of numbers. The transcript notes the record was made at Bell’s Volta Laboratory in Washington. Additional recordings include lines from Shakespeare.


Walmart Takes On Big Data

Many of the big data tools were developed at Walmart Labs, which was created after Walmart took over Kosmix in 2011. The products developed at Walmart Labs are Social Genome, Shoppycat, and Get on the Shelf.

The Social Genome product allows Walmart to reach customers, or friends of customers, who have mentioned a product online, to tell them about that exact product and offer a discount.
Public data from the web is combined with social data and proprietary data such as customer purchasing data and contact information. The result is a constantly changing, up-to-date knowledge base with hundreds of millions of entities and relationships. This gives Walmart a better understanding of what its customers are saying online. An example mentioned by Walmart Labs shows a woman tweeting regularly about movies. When she tweets “I love Salt”, Walmart is able to understand that she is talking about the movie Salt and not the condiment.
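
The “Salt” example can be illustrated with a toy sketch of context-based disambiguation. This is not Walmart Labs’ method, which is not described here; the candidate entities, keyword lists, and function names below are all hypothetical, and a real knowledge base would be far larger.

```python
# Toy context-based disambiguation. All entity names and keyword
# lists here are hypothetical and chosen only for illustration.
CANDIDATES = {
    "salt": {
        "Salt (movie)": {"movie", "film", "cinema", "trailer", "watched", "actor"},
        "salt (condiment)": {"recipe", "cooking", "pepper", "dinner", "taste"},
    }
}

def tokenize(text):
    """Lowercase a text and split it into words, trimming basic punctuation."""
    return {word.strip(".,!?\"'").lower() for word in text.split()}

def disambiguate(mention, history):
    """Pick the sense whose keywords overlap most with the user's past posts."""
    context = set()
    for post in history:
        context |= tokenize(post)
    senses = CANDIDATES[mention.lower()]
    return max(senses, key=lambda sense: len(senses[sense] & context))

history = [
    "Watched a great film last night",
    "That trailer looks amazing",
    "Best movie of the year",
]
print(disambiguate("Salt", history))
```

Because the user’s earlier posts share words with the movie sense and none with the condiment sense, the movie sense wins. With no overlap at all, this sketch simply falls back to the first listed sense.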

The Shoppycat product developed by Walmart recommends suitable products to Facebook users based on the hobbies and interests of their friends.

Get on the Shelf was a crowd-sourcing solution that gave anyone the chance to promote his or her product in front of a large online audience. The best products would be sold at Walmart, with the potential to suddenly reach millions of customers.

World Wide Web For Robots

Read more

Brown University’s Patrick Ma Digital Scholarship Lab

This spring, the Brown University Library will host a series of talks to celebrate the opening of the Patrick Ma Digital Scholarship Lab at the John D. Rockefeller Jr. Library. Speakers will include Brown faculty and visiting scholars from across the academic disciplines, who will discuss and use the Lab to demonstrate how digital technologies affect their teaching and research and enable new forms of student learning and interaction.

Read More

 

Petition To Unlock Cell Phones Update

The petition asking President Obama to oppose a new rule restricting cell phone owners from unlocking their devices has passed the 100,000 signatures needed, meaning the White House must now respond.

The petition, which now has more than 102,000 signatures, protests a regulation from the Library of Congress that prohibits unlocking phones without the carrier’s permission, even when a customer’s contract with the carrier has expired.

CTIA general counsel Michael Altschul wrote in a blog post that the rule “makes our streets just a little bit safer by making it harder for large-scale phone trafficking operations to operate in the open and purchase large quantities of phones, unlock them, and resell them in foreign markets.”

The petition is partly symbolic: The Library of Congress and the U.S. Copyright Office are part of the legislative branch, not the executive branch, meaning that Obama cannot overturn the decision even if he disagrees with it.

Congress has the power to rewrite the law, the 1998 Digital Millennium Copyright Act, which hands the Library of Congress the effective power to regulate certain gadgets in the name of copyright law. And a nudge from the administration would speed up any DMCA legislative fixes. Under the DMCA, Americans are broadly prohibited from “circumventing” copyright-related technologies, with criminal penalties targeting people who profit from doing it. But the DMCA gives the Library of Congress the authority to grant exemptions, which it did for cell phone unlocking utilities in 2006 and 2010.

The Library of Congress reversed its position last fall, after lobbying from CTIA, which represents carriers including AT&T, Verizon Wireless, T-Mobile, and Sprint Nextel. It ruled that the exemption was no longer necessary because there are no “adverse effects” relating to locked phones, and unlocked phones are now readily available.

The Library of Congress’ regulatory turnaround doesn’t affect jailbreaking or rooting mobile phones, which is currently permitted through at least 2015.

 

Petition To Unlock Mobile Phones

A petition to reverse a Library of Congress decision that made unlocking mobile phones illegal needs 15,000 more signatures by Saturday.

Over 85,000 people have signed a Whitehouse.gov petition asking President Barack Obama to reverse a decision by the Library of Congress making the unlocking of mobile phones illegal under the Digital Millennium Copyright Act (DMCA).

As of Wednesday morning, the petition, started by phone unlocking entrepreneur Sina Khanifar, still needed nearly 15,000 signatures by Saturday to trigger a response by the Obama administration.

Unlocking a phone is typically done to switch carriers. Jailbreaking a phone for the purposes of adding software unauthorized by the carrier or phone maker remains legal under the DMCA. It’s unlikely mobile carriers will seek prosecution of individual phone users, but operators of businesses that help consumers unlock their phones could face fines of up to $500,000 under the DMCA.

Khanifar said this week he’s optimistic 100,000 people will sign it by Saturday. The petition has recently won endorsements from Representative Peter DeFazio, an Oregon Democrat,

DH Awards Voting 2012

View More

The Cloud

Cloud Vmware

A Data Visualization: The Internet Map

View Here

Data Becoming Bigger and Better 2013

Mortar

Infochimps

Microsoft Windows Azure HDInsight

Some companies are trying to make Hadoop more useful by turning it into a platform for something other than running MapReduce jobs. The companies: Continuuity, Platfora, and Drawn to Scale.
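
For readers unfamiliar with the MapReduce jobs mentioned above, the programming model can be sketched in plain Python. This is a single-process illustration of the map, shuffle, and reduce steps, not Hadoop’s actual API; the function names are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run the three steps in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

print(word_count(["big data", "Big deal"]))
```

In a real cluster the map and reduce steps would run in parallel on many machines, with the framework handling the shuffle between them.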

 

Differential Privacy and Big Data

Microsoft Research is developing differential privacy technology that would serve as a privacy guard and go-between when researchers query databases. It would ensure that no individual could be re-identified, protecting privacy by keeping people anonymous in databases while still helping researchers sort through big data.
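
One textbook way to build such a go-between is the Laplace mechanism, which adds calibrated random noise to query answers. The sketch below is a minimal illustration under that assumption; it is not necessarily the mechanism Microsoft Research uses, and the function name, the sample records, and the choice of epsilon are hypothetical.

```python
import random

def private_count(records, predicate, epsilon, rng=random):
    """Return a noisy count of matching records (Laplace mechanism).

    Adding or removing one person changes a count by at most 1,
    so the noise scale is 1 / epsilon. Smaller epsilon means more
    noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

# Hypothetical records: (age, has_condition)
patients = [(34, True), (51, False), (29, True), (62, True), (45, False)]
print(private_count(patients, lambda p: p[1], epsilon=0.5, rng=random.Random(0)))
```

The researcher only ever sees the noisy answer, so the presence or absence of any single person has a bounded effect on what can be learned.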

Differential Privacy for Everyone

 

Big Data

After growing accustomed to terms like megabyte, gigabyte, and terabyte, we must now prepare ourselves for a whole new vocabulary, such as petabyte, exabyte, and zettabyte, which will become just as common.
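
To give a sense of the scale involved, the sketch below converts between these units. It assumes decimal (SI) prefixes, where each step up is a factor of 1,000; the helper names are hypothetical.

```python
# Decimal (SI) storage units, each 1,000 times larger than the previous one.
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte",
    "terabyte", "petabyte", "exabyte", "zettabyte",
]

def bytes_in(unit):
    """Number of bytes in one unit, e.g. one terabyte is 1000**4 bytes."""
    return 1000 ** UNITS.index(unit)

def convert(value, from_unit, to_unit):
    """Convert a quantity from one unit to another."""
    return value * bytes_in(from_unit) / bytes_in(to_unit)

print(convert(1, "zettabyte", "terabyte"))
```

Under this convention one zettabyte is a billion terabytes, which is why the new vocabulary is needed.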

Dr Riza Berkan, CEO and board member of Hakia, provides a list of mechanisms generating big data:

  • Data from scientific measurements and experiments (astronomy, physics, genetics, etc.)
  • Peer to peer communication (text messaging, chat lines, digital phone calls)
  • Broadcasting (News, blogs)
  • Social Networking (Facebook, Twitter)
  • Authorship (digital books, magazines, Web pages, images, videos)
  • Administrative (enterprise or government documents, legal and financial records)
  • Business (e-commerce, stock markets, business intelligence, marketing, advertising)
  • Other

Dr Riza Berkan says Big Data can be a blessing and a curse.

He says that although there should be clear boundaries between data segments that belong to specific objectives, this very concept is misleading and can undermine potential opportunities. For example, scientists working on human genome data may improve their analysis if they could take the entire content (publications) on Medline (or Pubmed) and analyze it in conjunction with the human genome data. However, this requires natural language processing (semantic) technology combined with bioinformatics algorithms, which is an unusual coupling at best. Two different data segments in different formats, when combined, actually define a new “big data”. Now, add to that a third data segment, such as the FBI’s DNA bank, or geneology.com, and you’ll see the complications and opportunities can go on and on. This is where the mystery and the excitement of big data reside.

Super Big Data Software

Dr Riza Berkan asks: are we prepared for generating data at colossal volumes? He suggests looking at this question in two stages: (1) platform and (2) analytics “super” software.

Apache Hadoop’s open source software enables the distributed processing of large data sets across clusters of commodity servers, an approach often associated with cloud computing. IBM’s Platform Symphony is another example of grid management suitable for a variety of distributed computing and big data analytics applications. Oracle, HP, SAP, and Software AG are very much in the game for this $10 billion industry. While these giants are offering a variety of solutions for distributed computing platforms, there is still a huge void at the level of analytics Super Software. Super Software’s main function would be to discover new knowledge that would otherwise be impossible to acquire via manual means, says Dr Berkan.

Discovery requires the following functions:

  • Finding associations across information in any format
  • Visualization of associations
  • Search
  • Categorization, compacting, summarization
  • Characterization of new data (where it fits)
  • Alerting
  • Cleaning (deleting unnecessary clogging information)

Moreover, Dr Berkan says that “Super Software would be able to identify genetic patterns of a disease from human genome data, supported by clinical results reported in Medline, and further analyzed to unveil mutation possibilities using FBI’s DNA bank of millions of DNA information. One can extend the scope and meaning of top level objectives which is only limited by our imagination.”

Dr Berkan also says big data can be a curse if cleaning (deleting) technologies are not considered part of the Super Software operation. In his previous post, “information pollution”, he emphasized the danger of uncontrollable growth of information, which is the invisible devil in the information age.

credits: Search Engine Journal/SEG

 

Amazon’s Big Data Warehousing

Amazon says Redshift, its cloud-based data warehouse service, can deliver better scalability and performance than conventional on-premises data warehouses at dramatically lower costs. Promising more than a cost advantage, Amazon said its managed service approach also frees data warehouse administrators from the tasks of monitoring, tuning, doing backups, patching software and recovering from faults. Redshift is based on relational database technology, uses SQL as its query language, and is compatible with existing BI tools.

 

Interactive Map Tracks Racist Tweets After Obama Re-Election

A group devoted to data visualization aggregated tweets from those opposed to President Obama’s re-election and mapped their locations. Those feelings spilled out online in the form of racist tweets. Floating Sheep, which maps and analyzes user-generated geo-coded data, collected the geographically identified racist tweets and distributed that data across a map of the United States. The map indicates that the highest number of racist tweets originated from Mississippi and Alabama. Montana, Idaho, Wyoming, and South Dakota are completely grayed out, indicating no racist tweets. According to previously constructed maps detailing Twitter usage, that region of the country is typified by markedly low Twitter activity.

View the Floating Sheep map here

Big Data and the Legal Profession

 

Read More

IBM’s Understanding Big Data e-book

 

PDF

BYOD

Gartner says Bring Your Own Device (BYOD) is an alternative strategy that allows employees, business partners and other users to use a personally selected and purchased client device to execute enterprise applications and access data. For most organizations, the program is limited to smartphones and tablets, but the strategy may also be used for PCs. It may or may not include subsidies for equipment or service fees.

Read More

Big Data and Other Technologies

 

Currently, big data is synonymous with technologies like Hadoop and the “NoSQL” class of databases, such as Mongo (document stores) and Cassandra (key-values). Today it’s possible to stream real-time analytics with ease. Spinning clusters up and down is a (relative) cinch, accomplished in 20 minutes or less.

Now there are new untapped open source technologies out there.

STORM AND KAFKA

Storm and Kafka are used at a number of high-profile companies including Groupon, Alibaba, and The Weather Channel.

Storm and Kafka are said to handle data velocities of tens of thousands of messages every second.

Drill and Dremel are said to put power in the hands of business analysts, and not just data engineers.

R

R is an open source statistical programming language. It is incredibly powerful. Over two million (and counting) analysts use R. R works very well with Hadoop.

GREMLIN AND GIRAPH

Gremlin and Giraph help empower graph analysis, and are often used coupled with graph databases like Neo4j or InfiniteGraph, or in the case of Giraph, working with Hadoop.

SAP HANA

SAP HANA is an in-memory analytics platform that includes an in-memory database and a suite of tools and software for creating analytical processes and moving data in and out, in the right formats.

 

Big Data and High Performance Computing

 

Read More

New York City Shows Promising Signs of Becoming the Next Silicon Valley

Tech giants Google and Facebook have established a presence in New York in recent years. Some big-name newcomers are headquartered here. Plans for an elite technology graduate school, attracted with city money, are getting enough attention that a federal patent officer is being stationed on campus in a first-of-its-kind arrangement.

Entrepreneurs say New York also faces particular challenges, including problematic broadband access in a few areas and a limited tech talent base, though the city is trying to address the concerns. New York is on solid ground, so to speak, in financial technology and online publishing, but the growth of social media and digital marketing opens new prospects for a city known for communications, design and advertising. Some prominent startups include Foursquare, Tumblr, Kickstarter and Gilt Groupe. They were established in New York in the past five years.

The city’s biggest move was offering 12 acres of land and up to $100 million in improvements for a tech-focused graduate school. Cornell University and Technion-Israel Institute of Technology won a competition to run the school, set to start with a handful of students in January. It will be the first institution in the country to have an on-campus patent officer, acting U.S. Commerce Secretary Rebecca Blank announced this month. Columbia University and New York University were also offered $15 million apiece in incentives to create new technology programs.

 

Interactive Data Visualizations

 

 

View

Upcoming Event 2012

Microsoft Cloud OS platform: Windows Server 2012, Windows Azure, Visual Studio 2012 and System Center 2012.

More Info

Business Intelligence Applications Making Transitions

Business intelligence applications have begun to transition from OLAP to a new type of service that connects data from social networks, third-party apps and other sources. NoSQL has emerged as a popular option for its ability to scale across cheap, commodity-based nodes. It’s much cheaper than scaling with vertically integrated systems that require attaching expensive storage arrays.

A new generation of big data applications is turning up, which in turn has put pressure on enterprise vendors to modify existing software suites. Venture capitalists will continue to invest in data infrastructure and big data apps that represent the manifest disruption in IT.
