07 June 2018

Realtime tSNE Visualizations with TensorFlow.js




In recent years, the t-distributed Stochastic Neighbor Embedding (tSNE) algorithm has become one of the most used and insightful techniques for exploratory data analysis of high-dimensional data. Used to interpret deep neural network outputs in tools such as the TensorFlow Embedding Projector and TensorBoard, a powerful feature of tSNE is that it reveals clusters of high-dimensional data points at different scales while requiring only minimal tuning of its parameters. Despite these advantages, the computational complexity of the tSNE algorithm limits its application to relatively small datasets. While several evolutions of tSNE have been developed to address this issue (mainly focusing on the scalability of the similarity computations between data points), they have so far not been enough to provide a truly interactive experience when visualizing the evolution of the tSNE embedding for large datasets.

In “Linear tSNE Optimization for the Web”, we present a novel approach to tSNE that heavily relies on modern graphics hardware. Given the linear complexity of the new approach, our method generates embeddings faster than comparable techniques and can even be executed on the client side in a web browser by leveraging GPU capabilities through WebGL. The combination of these two factors allows for real-time interactive visualization of large, high-dimensional datasets. Furthermore, we are releasing this work as an open source library in the TensorFlow.js family in the hope that the broader research community finds it useful.
Real-time evolution of the tSNE embedding for the complete MNIST dataset with our technique. The dataset contains images of 60,000 handwritten digits. You can find a live demo here.
The aim of tSNE is to cluster small “neighborhoods” of similar data points while also reducing the overall dimensionality of the data so it is more easily visualized. In other words, the tSNE objective function measures how well these neighborhoods of similar data are preserved in the 2- or 3-dimensional space, and arranges them into clusters accordingly.

In previous work, the minimization of the tSNE objective was performed as an N-body simulation problem, in which points are randomly placed in the embedding space and two different types of forces are applied on each point. Attractive forces bring the points closer to the points that are most similar in the high-dimensional space, while repulsive forces push them away from all the neighbors in the embedding.

While the attractive forces act on a small subset of points (i.e., similar neighbors), repulsive forces are in effect between all pairs of points. Because of this, tSNE requires significant computation and many iterations of the objective function, which limits the possible dataset size to just a few hundred data points. To improve over a brute force solution, the Barnes-Hut algorithm was used to approximate the repulsive forces and the gradient of the objective function. This allows scaling of the computation to tens of thousands of data points, but it requires more than 15 minutes to compute the MNIST embedding in a C++ implementation.
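
To make the cost of the brute-force approach concrete, here is a minimal sketch of one gradient evaluation in plain Python. It assumes the standard textbook form of the tSNE gradient with a Student-t kernel in the embedding; the function names `tsne_gradient` and `descent_step` are hypothetical and are not taken from the paper or from any library. The nested loops over all pairs are what make this approach quadratic in the number of points.

```python
def tsne_gradient(P, Y):
    """Brute-force tSNE gradient for a 2D embedding (illustrative sketch).

    P: symmetric matrix of high-dimensional similarities (list of lists).
    Y: list of (x, y) embedding positions.
    """
    n = len(Y)
    # Student-t kernel w_ij = 1 / (1 + ||y_i - y_j||^2) for every pair: O(N^2).
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = Y[i][0] - Y[j][0]
                dy = Y[i][1] - Y[j][1]
                W[i][j] = 1.0 / (1.0 + dx * dx + dy * dy)
    Z = sum(sum(row) for row in W)  # normalization over all pairs

    grad = []
    for i in range(n):
        ax = ay = rx = ry = 0.0
        for j in range(n):
            if i == j:
                continue
            dx = Y[i][0] - Y[j][0]
            dy = Y[i][1] - Y[j][1]
            w = W[i][j]
            # Attractive term: only matters where P[i][j] is non-negligible.
            ax += P[i][j] * w * dx
            ay += P[i][j] * w * dy
            # Repulsive term: q_ij * w_ij with q_ij = w_ij / Z, for ALL pairs.
            rx += (w * w / Z) * dx
            ry += (w * w / Z) * dy
        grad.append((4.0 * (ax - rx), 4.0 * (ay - ry)))
    return grad


def descent_step(P, Y, learning_rate=1.0):
    """Move every point one step against the gradient."""
    grad = tsne_gradient(P, Y)
    return [(x - learning_rate * gx, y - learning_rate * gy)
            for (x, y), (gx, gy) in zip(Y, grad)]
```

Calling `descent_step` repeatedly on a random initial `Y` gives the N-body style minimization described above; Barnes-Hut and the approach in this post replace only the repulsive part of the inner loop with an approximation.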

In our paper, we propose a solution to this scaling problem by approximating the gradient of the objective function using textures that are generated in WebGL. Our technique draws a “repulsive field” at every minimization iteration using a three-channel texture, with the three components treated as colors and drawn in the RGB channels. For every point, the field encodes the horizontal and vertical repulsive force created by that point, plus a third component used for normalization. Intuitively, the normalization term ensures that the magnitude of the shifts matches the similarity measure in the high-dimensional space. In addition, the resolution of the texture is adaptively changed to keep the number of pixels drawn constant.
Rendering of the three functions used to approximate the repulsive effect created by a single point. In the above figure, a point in a blue area is pushed to the left/bottom, a point in a red area is pushed to the right/top, and a point in a white region does not move.
The contribution of every point is then added on the GPU, resulting in textures similar to those presented in the GIF below, which approximate the repulsive fields. This repulsive field approach turns out to be much more GPU friendly than the more commonly used calculation of point-to-point interactions, because the repulsion for multiple points can be computed at once and very quickly on the GPU. In addition, we implemented the computation of the attraction between points on the GPU.
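
The field idea can be sketched on the CPU in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the field is evaluated at the nodes of a fixed grid (standing in for texture pixels), each point reads its repulsion from the nearest node, and the names `repulsive_field` and `sample_repulsion` are hypothetical. On the GPU the accumulation step is done by drawing and blending textures rather than by looping.

```python
def repulsive_field(Y, res=64):
    """Accumulate a three-channel field (S, Vx, Vy) on a res x res grid.

    S is the normalization channel; Vx and Vy are the horizontal and
    vertical repulsion channels. res must be at least 2.
    """
    xs = [p[0] for p in Y]
    ys = [p[1] for p in Y]
    x0, y0 = min(xs), min(ys)
    # Node spacing; fall back to 1.0 if all points share a coordinate.
    hx = (max(xs) - x0) / (res - 1) or 1.0
    hy = (max(ys) - y0) / (res - 1) or 1.0
    field = [[(0.0, 0.0, 0.0)] * res for _ in range(res)]
    for r in range(res):
        for c in range(res):
            px, py = x0 + c * hx, y0 + r * hy
            s = vx = vy = 0.0
            for (x, y) in Y:  # every point adds its contribution to this node
                dx, dy = px - x, py - y
                w = 1.0 / (1.0 + dx * dx + dy * dy)
                s += w
                vx += w * w * dx
                vy += w * w * dy
            field[r][c] = (s, vx, vy)
    return field, (x0, y0, hx, hy)


def sample_repulsion(Y, field, grid):
    """Read each point's repulsion from the nearest grid node."""
    x0, y0, hx, hy = grid
    res = len(field)
    samples = []
    for (x, y) in Y:
        c = min(res - 1, max(0, round((x - x0) / hx)))
        r = min(res - 1, max(0, round((y - y0) / hy)))
        s, vx, vy = field[r][c]
        # Remove the point's own contribution at the sampled node.
        dx, dy = (x0 + c * hx) - x, (y0 + r * hy) - y
        w = 1.0 / (1.0 + dx * dx + dy * dy)
        samples.append((s - w, vx - w * w * dx, vy - w * w * dy))
    Z = sum(s for s, _, _ in samples)  # normalization term
    if Z == 0.0:  # a single point has nothing to be repelled by
        return [(0.0, 0.0)] * len(Y)
    return [(vx / Z, vy / Z) for _, vx, vy in samples]
```

With the grid size held fixed, the sampling step touches each point once, which is where the linear behavior comes from; the accuracy of the result depends on how fine the grid is relative to the spread of the points.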
This animation shows the evolution of the tSNE embedding (upper left) and of the scalar fields used to approximate its gradient with normalization term (upper right), horizontal shift (bottom left) and vertical shift (bottom right).
We additionally revised the update of the embedding from an ad-hoc implementation to a series of standard tensor operations that are computed in TensorFlow.js, a JavaScript library to perform tensor computations in the web browser. Our approach, which is released as an open source library in the TensorFlow.js family, allows us to compute the evolution of the tSNE embedding entirely on the GPU while having better computational complexity.

With this implementation, what used to take 15 minutes to calculate (on the MNIST dataset) can now be visualized in real-time and in the web browser. Furthermore, this allows real-time visualizations of much larger datasets, a feature that is particularly useful when deep neural network output is analyzed. One main limitation of our work is that this technique currently only works for 2D embeddings. However, 2D visualizations are often preferred over 3D ones, as 3D visualizations require more interaction to effectively understand cluster results.

Future Work
We believe that having a fast and interactive tSNE implementation that runs in the browser will empower developers of data analytics systems. We are particularly interested in exploring how our implementation can be used for the interpretation of deep neural networks. Additionally, our implementation shows how lateral thinking in using GPU computations (approximating the gradient using an RGB texture) can be used to significantly speed up algorithmic computations. In the future we will be exploring how this kind of gradient approximation can be applied not only to speed up other dimensionality reduction algorithms, but also to implement other N-body simulations in the web browser using TensorFlow.js.

Acknowledgements
We would like to thank Alexander Mordvintsev, Yannick Assogba, Matt Sharifi, Anna Vilanova, Elmar Eisemann, Nikhil Thorat, Daniel Smilkov, Martin Wattenberg, Fernanda Viegas, Alessio Bazzica, Boudewijn Lelieveldt, Thomas Höllt, Baldur van Lew, Julian Thijssen and Marvin Ritter.

Google Cloud announces the Beta of single tenant instances


One of the characteristics of cloud computing is that when you launch a virtual machine, it gets distributed wherever it makes the most sense for the cloud provider. That usually means sharing servers with other customers in what is known as a multi-tenant environment. But what about times when you want a physical server dedicated just to you?

To help meet those kinds of demands, Google announced the Beta of Google Compute Engine sole-tenant nodes, which are designed for use cases such as regulatory or compliance requirements, where you need full control of the underlying physical machine and sharing is not desirable.

“Normally, VM instances run on physical hosts that may be shared by many customers. With sole-tenant nodes, you have the host all to yourself,” Google wrote in a blog post announcing the new offering.

Diagram: Google

Google has tried to be as flexible as possible, letting customers choose exactly what configuration they want in terms of CPU and memory. You can let Google choose the dedicated server that’s best at any particular moment, or you can manually select the server if you want that level of control. In both cases, you will be assigned a dedicated machine.

If you want to play with this, there is a free tier and then various pricing tiers for a variety of computing requirements. Regardless of your choice, you will be charged on a per-second basis with a one-minute minimum charge, according to Google.
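
As a rough illustration of what per-second billing with a one-minute minimum means in practice, here is a small sketch. The hourly rate used in the example is made up, the function names are hypothetical, and rounding partial seconds up is an assumption rather than something stated in the announcement.

```python
import math


def billed_seconds(runtime_seconds, minimum_seconds=60):
    """Seconds charged for one run: at least the minimum, otherwise the
    runtime rounded up to a whole second (the rounding is an assumption)."""
    return max(minimum_seconds, math.ceil(runtime_seconds))


def charge(runtime_seconds, hourly_rate):
    """Cost of one run at a hypothetical hourly rate."""
    return billed_seconds(runtime_seconds) * hourly_rate / 3600.0
```

Under these assumptions, a 10-second run is billed as 60 seconds, while a 90-second run is billed as exactly 90 seconds.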

Since this feature is still in Beta, it’s worth noting that it is not covered under any SLA. Microsoft and Amazon have similar offerings.



Women’s Safety XPRIZE $1M winner is a smart, simple panic button


Devices like smartphones ought to help people feel safer, but if you’re in real danger the last thing you want to do is pull out your phone, go to your recent contacts, and type out a message asking a friend for help. The Women’s Safety XPRIZE just awarded its $1 million prize to one of dozens of companies attempting to make a safety wearable that’s simple and affordable.

The official challenge was to create a device costing less than $40 that can “autonomously and inconspicuously trigger an emergency alert while transmitting information to a network of community responders, all within 90 seconds.”

Anu and Naveen Jain, the entrepreneurs who funded the competition, emphasized the international and very present danger of sexual assault in particular.

“Women’s safety is not just a third world problem; we face it every day in our own country and on our college campuses,” said Naveen Jain in the press release announcing the winner. “It’s not a red state problem or a blue state problem but a national problem.”

“Safety is a fundamental human right and shouldn’t be considered a luxury for women. It is the foundation in achieving gender equality,” added Anu Jain.

Out of dozens of teams that entered, five finalists were chosen in April: Artemis, Leaf Wearables, Nimb & SafeTrek, Saffron, and Soterra. All had some variation on a device that either detected or was manually activated during an attack or stressful situation, alerting friends to one’s location.

The winner was Leaf, which had the advantage of having already shipped a product along these lines, the Safer pendant. Like any other Bluetooth accessory, it keeps in touch with your smartphone wirelessly and when you press the button twice your emergency contacts are alerted to your location and need for help. It also records audio, possibly providing evidence later or a deterrent to harassers who might fear being identified.

It’s not that it’s an original idea — we’ve had various versions of this for some time, and even covered one of the other finalists last year. But they haven’t been quantitatively evaluated or given a platform like this.

“These devices were tested in many conditions by the judges to ensure that they will work in real-life cases where women face dangers today. They were tested in no-connectivity areas, on public transit, in basements of buildings, among other environments,” explained Anu Jain to TechCrunch. “Having the capability to record audio after sending the alert was one of the main differentiators for Leaf Wearables. Their chip design and software was also easy to be integrated into other accessories.”

Hopefully the million dollars and the visibility from winning the prize will help Leaf get its product out to people who need it. The runners up don’t seem likely to give up on the problem, either. And it seems like the devices will only get better and cheaper — not that this will change the world on its own.

“Prices will come down as the sensor prices drop. In many countries it will require community support to be built,” continued Jain. “These technologies can act as a deterrent but in the long term culture of violence against women must change.”



SeatGeek brings ticket buying into Snapchat


You can now buy game and concert tickets from teams and musicians within Snapchat, thanks to an integration with SeatGeek.

While Snapchat has started testing e-commerce features in the past few months, SeatGeek says this is the first ticket-buying experience built into the Snapchat app.

The Los Angeles Football Club was the first team to sell tickets through this integration, by posting a Snapchat Story (and a Snapcode on the team website) that allowed users to swipe up to buy tickets to the May 26 game. The full purchase experience takes place without leaving the app.

“We’re always looking to reach our fans in innovative ways, and selling tickets directly to our followers on Snapchat gives us an incredible opportunity to connect with our most dedicated supporters,” said Los Angeles Football Club President and co-owner Tom Penn in the announcement.


SeatGeek co-founder Russ D’Souza said that as “the pipe gets solidified,” you’ll start seeing more Snapchat/SeatGeek ticket sales. He added that this is the kind of integration he was hoping for when the company launched the SeatGeek Open platform a couple of years ago, allowing teams, musicians and other rightsholders to sell tickets directly through SeatGeek. (The platform also supports ticket sales through Facebook.)

“For too long, the legacy ticketing approach has been to make it difficult for teams to sell tickets in lots of places,” D’Souza said. “Teams should want to sell their tickets in as many places as possible.”

And it sounds like there are additional deals in the works: “What we’re excited about over the next few months is beating the drumbeat of openness with new partnerships … We want to drive the whole industry forward and create more tangible results that cause the industry to open up.”



Here’s the sequel to the surprisingly nice BlackBerry KeyOne


TCL just dropped the sequel to the KeyOne, the company’s surprisingly good keyboard-sporting BlackBerry handset. We reviewed it roughly this time last year, and it was almost enough to restore our faith in the possibilities of BlackBerry as a brand. Almost. Of course, that had much more to do with TCL’s ability to create solid hardware than any residual BB legacy.

The Key2 builds on the promise of its predecessor, bringing back the physical keyboard and familiar BlackBerry-styled design, constructed around a 4.5-inch touchscreen and aluminum frame. The phone, naturally, runs Android (8.1 to start), loaded up with your standard suite of BlackBerry software, including DTEK. The security app has been updated with a new Proactive Health feature, which offers a full system scan.

As TCL proudly notes, this is the first BlackBerry/BlackBerry-branded device to feature dual rear-facing cameras, so that’s something. The pair of 12-megapixel cameras help deliver the device into 2018 with features like Portrait Mode, Optical Super Zoom and Google Lens.

There’s a chunky 3,500mAh battery and a middling Snapdragon 660, coupled with a generous 6GB of RAM and either 64GB or 128GB of storage. Not too shabby, but all of that comes with a $649 price tag, a $100 premium over the KeyOne. That should make this a tougher pill to swallow for what, to many, no doubt still feels like a bit of a novelty in the smartphone category.

The Key2 starts shipping this month, and TCL tells me that it plans to keep selling the KeyOne as well, for the time being.



Google says over 8 million people use its free WiFi service at railway stations in India


Back in 2015, Google launched an initiative to bring free WiFi to India’s railway stations and today the U.S. tech giant announced that the program has passed its target of reaching 400 stations, attracting a base of eight million users in the process.

The milestone was hit today when Dibrugarh station in the northeastern state of Assam went online.

Google gave some insight into the scale of the program’s reach when it revealed that over eight million people use the railway-based WiFi each month. On average, the firm said, users consume 350MB in data per session with half going online via the WiFi program at least twice per day.

In another sign of scale, Google began to monetize the initiative earlier this year by offering high-speed connections for a price. The standard option includes ads to develop revenue for Google and its partners, which include Indian Railways and RailTel.

Reaching eight million users and over 400 stations is hugely impressive, but Google said that its journey “remains unfinished.” Beyond connecting stations, the firm wants to add free WiFi to other connection points across India.

“India has the second largest population of internet users in the world, but there are still almost a billion Indians who aren’t online. There are millions of other life-changing journeys that still haven’t been taken. We realize that not everyone in India lives or works near a train station,” Caesar Sengupta, VP of Google’s Next Billion team, wrote in a blog post.

The program is also taking root overseas. Google has already expanded it to Indonesia and Mexico, and Sengupta said that it will make its way to “even more countries soon.”

Google isn’t the only tech giant pioneering a free Wi-Fi model. Facebook’s successor to Internet.org (the program that was banned in India for violating net neutrality regulations) launched in India last year. The company hasn’t said much about it, but it isn’t likely to have anything like the same scale as Google’s.

Free Wi-Fi isn’t the only India-specific strategy from Google. The U.S. firm has launched a series of local services in India, including data-friendly versions of its top apps, a mobile payment network called Tez, a food delivery service and, most recently, a social network for local communities.



Evernote is spinning out its Chinese business and it plans to take it public


Here’s a unique approach to Western companies doing business in China. Today, Evernote, the U.S. note-making service, spun out its China-based unit into an independent entity with “full autonomy” over its business and services.

Evernote introduced its Yinxiang Biji China-based service in 2012, but now it is transitioning to a minority shareholder with the Chinese management team taking day-to-day control. As part of its move to independence, Yinxiang Biji has raised an undisclosed Series A round from the Sequoia CBC Cross-border Digital Industry Fund.

The terms are not disclosed, but Raymond Tang, CEO of Yinxiang Biji, said ownership of the business is split roughly equally between Evernote, the Chinese investors and the startup’s management team — while Yinxiang Biji itself has raised “several hundred million RMB.” (For comparison, 100 million RMB is roughly $15 million.)

Evernote and Yinxiang Biji have inked a two-year deal that will see them cross-license IP, and Tang and Evernote CMO Andrew Malcolm told TechCrunch in an interview that the duo will continue to work closely. The IP deal could also be extended, according to Malcolm, who added that the spin-out has been a move that he and Tang have discussed since they both joined Evernote in 2015.

Yinxiang Biji claims to have more than 20 million registered users who have created over one billion notes. Tang said that note creation in China per user is 50 percent higher than Evernote’s other customer base, while the business has grown at a 60-percent rate annually.

More broadly, Malcolm said the China entity accounts for some 10 percent of Evernote’s global revenue but he acknowledged that, despite adopting a local strategy since it launched, Yinxiang Biji will have the freedom to push its business harder as an independent entity. That chiefly includes building out features that apply more directly in China, such as social integrations and more.

“Even without having done some of the basics that [Chinese] users would expect, we’ve found product-market fit. How much more impactful could we be if we allowed the Chinese market team to think about their brand, technology and innovation?” he said.

The company has arguably been one of the most successful U.S. tech companies to venture into China; LinkedIn, which is mired in some controversy, might be another. Yet a change is still needed since the existing approach “doesn’t satisfy what we have learned about how Chinese users want to use Evernote versus those in the rest of the world,” Malcolm summarized.

That sentiment was echoed by Eric Xu, partner of the Sequoia fund.

“I am convinced that Yinxiang Biji will further unleash its potential and pick up development after the spin-off, as technical and decision-making autonomy is gained and fully localized operations are on the way. Moreover, its business model is a frame of reference for future cross-border Internet partnerships,” Xu said in a supplied statement.

Beyond impact on the service, there is a major business reason, too. The move frees Yinxiang Biji up for a potential listing, which Malcolm and Tang both acknowledged is part of the plan since Chinese financial regulations are strict, including clauses such as two years of profitability. Part of that planned approach includes the new management structure, which makes Yinxiang Biji majority-Chinese owned thus satisfying another regulatory requirement.

“We are very much aware of how far ahead you need to be thinking” in order to go public in China, Malcolm said. “It’s top of our minds when we speak.”

Tang, meanwhile, suggested that the company might look to tap exchanges in Shanghai or Shenzhen, but there’s no immediate timeframe for that at this point. Both executives pointed out that the Chinese market requires a unique approach and, in this case for certain, Evernote is adopting one.

Evernote, once valued at over $1 billion, has been in a period of transition over the last few years after the exit of co-founder and CEO Phil Libin in the summer of 2015. A slew of other executives followed Libin, a ‘changing of the guard’ as perhaps might be expected when a founding member departs. Since then, the company has quietly solidified its business under the helm of CEO Chris O’Neill, who previously spent a decade with Google.

In that context, the Chinese move makes plenty of sense since it happened under the previous Evernote management regime, but it also raises questions about Evernote’s own immediate future, and a potential IPO. The company isn’t saying anything on that now, but it would be quite something if the business unit it set up in China went public before the mothership.



Photos on social media can predict the health of neighborhoods


The images that appear on social media – happy people eating, cultural happenings, and smiling dogs – can actually predict the likelihood that a neighborhood is “healthy” as well as its level of gentrification.

From the report:

So says a groundbreaking study published in Frontiers in Physics, in which researchers used social media images of cultural events in London and New York City to create a model that can predict neighborhoods where residents enjoy a high level of wellbeing — and even anticipate gentrification by 5 years. With more than half of the world’s population living in cities, the model could help policymakers ensure human wellbeing in dense urban settings.

The idea is based on the concept of “cultural capital” – the more there is, the better the neighborhood becomes. For example, if there are many pictures of fun events in a certain spot you can expect a higher level of well-being in that area’s denizens. The research also suggests that investing in arts and culture will actively improve a neighborhood.

“Culture has many benefits to an individual: it opens our minds to new emotional experiences and enriches our lives,” said Dr. Daniele Quercia. “We’ve known for decades that this ‘cultural capital’ plays a huge role in a person’s success. Our new model shows the same correlation for neighborhoods and cities, with those neighborhoods experiencing the greatest growth having high cultural capital. So, for every city or school district debating whether to invest in arts programs or technology centers, the answer should be a resounding ‘Yes!’”

The Cambridge-based team looked at “millions of Flickr images” taken at cultural events in New York and London and overlaid them on maps of these cities. The findings, as we can imagine, were obvious.

“We were able to see that the presence of culture is directly tied to the growth of certain neighborhoods, rising home values and median income. Our model can even predict gentrification within five years,” said Quercia. “This could help city planners and councils think through interventions to prevent people from being displaced as a result of gentrification.”

The team expects to be able to assess the health of citizens using the same method, overlaying pictures of food on maps in order to find food deserts and spots where cafes and croissants are on the rise. Just imagine: all those Instagrammed photos of your favorite sandwiches will some day help researchers build happier cities.



A friendly reminder: Don’t put passwords in Trello


A new bit of research from David Shear at security firm Flashpoint found that there are hundreds if not thousands of open Trello boards containing passwords, login credentials, and other potentially sensitive stuff including employee on-boarding documents. He and Brian Krebs reported the boards to Trello, although some folks had already been notified by well-meaning hackers who wrote “Change your password” on some of these public boards.

“One particularly jarring misstep came from someone working for Seceon, a Westford, Mass. cybersecurity firm that touts the ability to detect and stop data breaches in real time,” wrote Krebs. “But until a few weeks ago the Trello page for Seceon featured multiple usernames and passwords, including credentials to log in to the company’s WordPress blog and iPage domain hosting.”

Another Trello board made at Red Hat in 2017 offered passwords to a pair of online test servers.

Trello worked with the pair to take down the public boards they found and is working with Google to remove the cached sites.

“We have put many safeguards in place to make sure that public boards are being created intentionally and have clear language around each privacy setting, as well as persistent visibility settings at the top of each board,” said a Trello spokesperson.

Missteps like these are sadly common. Another rich trove of user data, GitHub, has been used to find private passwords for years. Anecdotally, a project I was working on suffered a breach when the CTO put a Bitcoin private key into some public GitHub code. Yeah. Exactly.

So, again, keep your Trello boards private, don’t paste passwords willy-nilly, and maintain at least a basic level of operational security by not pasting passwords into any site that could make it public. It’s hard but definitely worth the effort.

