21 December 2019

TikTok’s national security scrutiny tightens as U.S. Navy reportedly bans popular social app


TikTok may be the fastest-growing social network in the history of the internet, but it is also quickly becoming the fastest-growing security threat and thorn in the side of U.S. China hawks.

The latest, according to a notice published by the U.S. Navy this past week and reported on by Reuters and the South China Morning Post, is that service members may no longer install TikTok on their government-issued devices; those who keep the app risk being blocked from the Navy's intranet.

It’s just the latest example of the challenges facing the extremely popular app. Recently, lawmakers led by Missouri senator Josh Hawley demanded a national security review of TikTok and its Sequoia-backed parent company ByteDance, along with other tech companies that may share data with foreign governments such as China. Concerns over the potential leaking of confidential communications recently led the U.S. government to demand that Chinese owner Beijing Kunlun unwind its acquisition of the gay social network app Grindr.

The intensity of criticism on both sides of the Pacific has made it increasingly challenging to manage tech companies across the divide. As I recently discussed here on TechCrunch, Shutterstock has actively made it harder and harder to find photos deemed controversial by the Chinese government on its stock photography platform, a play to avoid losing a critical source of revenue.

We saw similar challenges with Google and its Project Dragonfly China-focused search engine as well as with the NBA.

What’s interesting here, though, is that companies on both sides of the Pacific are now struggling with policy in both countries. Chinese companies like ByteDance are increasingly being targeted and shut out of the U.S. market, while American companies have long struggled to get a foothold in the Middle Kingdom. That might be a more equal playing field than in the past, but it is certainly a less free market than it could be.

While the trade fight between China and the U.S. continues, the damage will continue to fall on companies that fail to draw within the lines set by policymakers in both countries. Whether any tech company can bridge that divide in the future unfortunately remains to be seen.


Read Full Article

5 Tools to Make Wikipedia Better and Discover Interesting Articles


Wikipedia is a great place to find information about any topic you’re interested in. But it can also be a great place to discover interesting topics you didn’t even know about. These tools help you discover new Wikipedia pieces and track what you want to read.

In case you didn’t already know, Wikipedia’s homepage offers a featured article every day, as well as topics in the news. If you aren’t visiting the homepage, you’re missing one of the best places to get new information every day. You should also consider the benefits of creating a Wikipedia account, so that you can track your interests and save pages for later.

1. Wiki Good Article (Twitter): Daily Random Article Worth Reading

Wiki Good Article Bot tweets a random link based on Wikipedia's six criteria for good articles

Did you know that Wikipedia has a set of criteria for what makes a good article? In fact, it maintains a list of these “good articles” that you can browse. Of course, that list is far too much to read in one day, so follow the Wiki Good Article bot on Twitter instead, which tweets one random entry from it every day.

The six factors of a good article are that it is well written, verifiable with no original research, broad in its coverage, neutral, stable, and illustrated. There are a few disqualifying factors too, but largely, these six are enough to weed out uninteresting pieces.

Importantly, an entry loses its “good article” status if it becomes one of Wikipedia’s featured articles. So this list becomes a good way to find worthwhile articles that you’d otherwise not come across easily.
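
If you’d rather pull a random good article on demand instead of waiting for the daily tweet, Wikipedia’s Special:RandomInCategory page can do it. A minimal sketch in Python; the category name “Good_articles” is my assumption about how the list is tracked:

    # Sketch: fetch a random entry from Wikipedia's good-article list by letting
    # Special:RandomInCategory redirect us to one of its members. The category
    # name "Good_articles" is an assumption; adjust it if the tracking category
    # is named differently.
    import requests

    URL = "https://en.wikipedia.org/wiki/Special:RandomInCategory/Good_articles"

    resp = requests.get(URL, timeout=10)   # the redirect is followed by default
    resp.raise_for_status()
    print("Random good article:", resp.url)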

2. Copernix (Web): World Map With Wikipedia Entries

Browse the map of the world with interesting Wikipedia entries on Copernix

Copernix is a mixture of Google Maps and Wikipedia. It is a fascinating way to browse the map of the world and learn new things about it. Whether it’s history, geography, or current events, this is the coolest map-based experience since Google Earth.

The map is filled with pins from interesting Wikipedia articles about any area. But it isn’t based on points of interest alone. That means you don’t need a physical structure there for Copernix to place a pin. The pin is about what’s interesting from that area, whether it’s a person, an event, or anything else.

You’ll see pins like Prophet Muhammed in Saudi Arabia, Ethiopian Airlines Flight 302 in Ethiopia, and so on. These aren’t landmarks in the physical sense, but they are landmark events in our world’s history, making them worth reading about.

At any point, you can browse a précis of all the pins in a pane on the left. Click a pin to expand its entry and read more about it. And there’s always a link to read the full Wikipedia entry. Fair warning: spend a few minutes on Copernix and you’re bound to go down the rabbit hole.
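
Copernix doesn’t say exactly how it sources its pins, but the underlying idea, surfacing Wikipedia entries near a coordinate, can be reproduced with the public MediaWiki geosearch API. A minimal sketch, with the radius and result limit picked arbitrarily for illustration:

    # Sketch: list Wikipedia articles geotagged near a point, similar in spirit
    # to Copernix's map pins. Uses the public MediaWiki GeoData "geosearch" API.
    import requests

    API = "https://en.wikipedia.org/w/api.php"

    def nearby_articles(lat, lon, radius_m=10000, limit=20):
        params = {
            "action": "query",
            "list": "geosearch",
            "gscoord": f"{lat}|{lon}",
            "gsradius": radius_m,   # metres, capped at 10000 by the API
            "gslimit": limit,
            "format": "json",
        }
        resp = requests.get(API, params=params, timeout=10)
        resp.raise_for_status()
        return [hit["title"] for hit in resp.json()["query"]["geosearch"]]

    # Example: articles geotagged around Addis Ababa
    print(nearby_articles(9.03, 38.74))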

3. Weeklypedia (Web): Weekly List of Major Changes in Wikipedia

The Weeklypedia is a newsletter digest that lists the articles with the most changes on Wikipedia in the past week, as well as new articles and active discussions

Wikipedia is a good indicator of important happenings. When any major event takes place in the world, editors hop onto the related articles and start updating them. The number of changes to an article is thus a good signal of what deserves your attention.

These edit histories are publicly available through Wikipedia’s APIs and data dumps. Weeklypedia tracks the changes and lists the 20 most edited articles each week, turning them into a newsletter. It’s like a digest of what’s happening around the world, dropped into your inbox. The interesting part is that the changed articles aren’t always news-related.
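
Weeklypedia’s exact pipeline isn’t documented here, but the same “most edited” idea can be roughly approximated from Wikipedia’s recent-changes feed. A sketch that fetches one batch of recent edits and tallies them; a real weekly digest would paginate over the full week:

    # Sketch: count edits per article from Wikipedia's recent-changes feed and
    # print the most edited titles. This only samples one batch of recent edits,
    # so it approximates, rather than reproduces, a full weekly digest.
    from collections import Counter

    import requests

    API = "https://en.wikipedia.org/w/api.php"

    params = {
        "action": "query",
        "list": "recentchanges",
        "rcnamespace": 0,    # article namespace only
        "rctype": "edit",
        "rcprop": "title",
        "rclimit": 500,      # maximum per request without a bot account
        "format": "json",
    }
    data = requests.get(API, params=params, timeout=10).json()
    counts = Counter(rc["title"] for rc in data["query"]["recentchanges"])

    for title, edits in counts.most_common(20):
        print(f"{edits:4d}  {title}")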

Along with the 20 most edited articles, Weeklypedia also tracks other activity on Wikipedia. The top five discussions, where editors debate hotly contested topics and what should or shouldn’t be said about them, are a great place to watch all sides of an argument unfold. And the top 10 new articles created during the week read like a little news bulletin.

There is simply no reason not to subscribe to Weeklypedia. Think of it as some leisure reading, as and when you want it.

4. WikiTweaks (Chrome): Better Looking Wikipedia and History Tracking

WikiTweaks makes Wikipedia look better and tracks recently visited links to show your journey down the rabbit hole

As amazing as Wikipedia is, its design could be far better. Its pages waste a lot of space and aren’t optimized for reading, especially when there are tables, charts, or images.

The Chrome extension WikiTweaks makes a few cosmetic changes to Wikipedia that use space more efficiently, making for a better reading experience. It’s an old extension that also adds previews when you hover over a link, but that’s no longer needed now that Wikipedia has made hover previews an official feature.

WikiTweaks also tracks your Wikipedia history, which is an invaluable tool for those who have a habit of falling down the rabbit hole. Click the extension icon and you’ll see the last Wikipedia pages you visited, instantly reminding you how you ended up on the page you’re reading.

Download: WikiTweaks for Chrome (Free)

5. EpubPress (Chrome, Firefox): Create an Ebook of Multiple Wikipedia Links

Create an offline ebook of multiple Wikipedia links with EpubPress

Once you’ve got your topics, you should be able to read them anywhere, even offline. Wikipedia offers its own tool to create and download a PDF of multiple links. But currently, the Book Creator tool is undergoing changes and you can’t get these PDFs.

EpubPress is a great alternative to Wikipedia’s Book Creator, and much simpler to use. Install the extension in your browser and browse Wikipedia as you normally would. At any point, click the extension to see a list of all open tabs. Choose the ones you want to add to the ebook, give it a name and a description, and hit download. It takes some time for the tool to compile and download all the pages, but it’s worth the wait.

The only restriction is that the final file is in ePub format, not PDF. That’s rarely a problem, as most e-reader apps support ePub. Alternatively, you can always convert the ePub to PDF or any other format with free online tools.

Download: EpubPress for Chrome | Firefox (Free)

Wikipedia Alternatives and Improvements

Wikipedia is unquestionably the biggest user-edited encyclopedia in the world, but it isn’t the only resource you should trust. It’s in your best interest to look at alternatives, and also to try to improve it as much as possible.

For starters, check out these five Wikipedia alternatives and tools for a better encyclopedia. You should especially try out Qikipedia, which gives Wikipedia previews anywhere on the web when you select some text.

Read the full article: 5 Tools to Make Wikipedia Better and Discover Interesting Articles


Read Full Article

The Worst Passwords of 2019 Have Been Revealed


The worst passwords of 2019 have been revealed, and the list shows that some people will never learn. These are the most commonly used passwords found in recent data breaches, which makes them both extremely common and trivially easy to crack. Avoid them at all costs.

Are Passwords Still Effective in 2019?

Passwords are slowly but surely being revealed to be a mediocre method of securing your online accounts. There have been so many data breaches now that a lot of passwords have been exposed to hackers and cybercriminals, rendering them ineffective.

Every year, SplashData, makers of several password managers, curates a list of the worst passwords of the year. The worst passwords of 2018 saw Donald Trump, Top Gun, Star Wars, and Harley Quinn inspire new (bad) passwords. So, what has happened in 2019?

The Worst Passwords of 2019, Revealed

SplashData has compiled its list of the worst passwords of 2019. There are 100 passwords listed in all, and there’s a mix of the usual suspects as well as some new entries. You can see the full list on the TeamsID website, but here are the Top 10 to give you a flavor.

  1. 123456
  2. 123456789
  3. qwerty
  4. password
  5. 1234567
  6. 12345678
  7. 12345
  8. iloveyou
  9. 111111
  10. 123123

So far, so familiar. All of these passwords appeared in the 2018 list, most of them near the top. Using “password” as a password is particularly dumb, so it’s heartening to see it slip down the list. However, millions of people are clearly still using it.

The idea of using a sequence of numbers is also still prevalent, and it continues all the way through the Top 100. As for interesting entries, “dragon” is new at #23, “liverpool” appears at #31, “ginger” makes it to #51, and “trustno1” comes in at #94.
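
If you want to sanity-check your own passwords against a list like this, it only takes a few lines of Python. A minimal sketch, assuming you have saved the published Top 100 to a local text file, one password per line; the filename below is a placeholder:

    # Sketch: check whether a password appears in a locally saved copy of the
    # worst-passwords list. The file name "worst-passwords-2019.txt" is a
    # placeholder; save the published Top 100 there yourself, one per line.
    from getpass import getpass

    with open("worst-passwords-2019.txt", encoding="utf-8") as f:
        worst = {line.strip().lower() for line in f if line.strip()}

    candidate = getpass("Password to check: ")
    if candidate.lower() in worst:
        print("This password is on the worst-of-2019 list. Change it now.")
    else:
        print("Not on the list, but use a password manager and 2FA anyway.")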

Always Enable Two-Factor Authentication

This paints a depressing picture of people’s attitudes toward passwords. However, it’s important to remember that this list represents only the most commonly breached passwords, a small fraction of all passwords in use. So we can hope that most people are taking their online security more seriously.

If you’re only now realizing how bad your passwords are, there are things you can do. For one, enable two-factor authentication wherever it’s available. You should also consider using a password manager, and here are the best password managers for every occasion.
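
If you’re curious what two-factor authentication actually does, most authenticator apps implement TOTP, a time-based one-time code derived from a secret shared between you and the service. A minimal sketch using the pyotp library, with a freshly generated secret purely for illustration:

    # Sketch: how an authenticator app derives its six-digit codes. A shared
    # secret is established once (usually via a QR code); afterwards both your
    # app and the server compute the same short-lived code from it.
    import pyotp

    secret = pyotp.random_base32()      # normally issued by the service
    totp = pyotp.TOTP(secret)

    print("Provisioning secret:", secret)
    print("Current code:", totp.now())  # rotates every 30 seconds
    print("Accepted right now?", totp.verify(totp.now()))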

Read the full article: The Worst Passwords of 2019 Have Been Revealed


Read Full Article

Vape lung is on the decline as CDC report fixes blame on oily additive


The CDC has issued a set of reports showing that the lung disease associated with vaping seems to be declining from peak rates, and that Vitamin E acetate seems — as speculated early on — to be the prime suspect for the epidemic. The affliction has cost at least 54 lives and affected 2,506 people across the nation.

The condition now officially known as EVALI (E-cigarette, or Vaping, Product Use-Associated Lung Injury) appeared over the summer, with hundreds of people reporting chest pains, shortness of breath and other symptoms. When state medical authorities and the CDC began comparing notes, it became clear that vaping was the common theme between the cases — especially using THC products.

Before long, the CDC recommended ceasing all vape product usage and was collating reports and soliciting samples from around the country. Its researchers have now issued several reports on the disease. The most significant finding echoes earlier indications that Vitamin E acetate, an oily substance that was apparently being used as a cutting agent in low-quality vaping cartridges, is at the very least a major contributor to the condition:

Building upon a previous study, CDC analyzed bronchoalveolar lavage (BAL) fluid from a larger number of EVALI patients from 16 states and compared them to BAL fluid from healthy people. Vitamin E acetate, also found in product samples tested by the FDA and state laboratories, was identified in BAL fluid from 48 of 51 EVALI patients and was not found in any of the BAL fluids of healthy people.

That’s pretty clear-cut, but importantly it does not exonerate other, perhaps even worse, additives that may not have been as widespread. It seems clear that vaping product producers will need to reestablish trust in the wake of this fatal blunder, and part of that will have to be transparency and regulation.

Vaping rose to prominence quickly and has proven difficult to effectively regulate. The shady companies that were selling stamped-on cartridges filled with what would prove to be a lethal adulterant have probably already picked up and moved on to the next scam.

The good news is the scale of the epidemic seems to have reached its maximum. There are still cases coming in, but the number of new patients is not rising sharply every month. Perhaps this indicates that people are taking the CDC’s advice and not vaping as much or at all, or perhaps the products using the additive have been quietly slipped off the market.


Read Full Article

ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations


Ever since the advent of BERT a year ago, natural language research has embraced a new paradigm, leveraging large amounts of existing text to pretrain a model’s parameters using self-supervision, with no data annotation required. So, rather than needing to train a machine-learning model for natural language processing (NLP) from scratch, one can start from a model primed with knowledge of a language. But, in order to improve upon this new approach to NLP, one must develop an understanding of what, exactly, is contributing to language-understanding performance — the network’s height (i.e., number of layers), its width (size of the hidden layer representations), the learning criteria for self-supervision, or something else entirely?

In “ALBERT: A Lite BERT for Self-supervised Learning of Language Representations”, accepted at ICLR 2020, we present an upgrade to BERT that advances the state-of-the-art performance on 12 NLP tasks, including the competitive Stanford Question Answering Dataset (SQuAD v2.0) and the SAT-style reading comprehension RACE benchmark. ALBERT is being released as an open-source implementation on top of TensorFlow, and includes a number of ready-to-use ALBERT pre-trained language representation models.
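
The official release is a TensorFlow implementation with pre-trained checkpoints. As a quick illustration (and an assumption on my part, since it is not the official code path), the same pre-trained weights can also be loaded through the Hugging Face transformers port:

    # Sketch: get contextual representations from a pre-trained ALBERT model.
    # This uses the Hugging Face "transformers" port of the released checkpoints
    # rather than the official TensorFlow code, purely for brevity.
    from transformers import AlbertModel, AlbertTokenizer

    tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")
    model = AlbertModel.from_pretrained("albert-base-v2")

    inputs = tokenizer("She deposited the check at the bank.", return_tensors="pt")
    outputs = model(**inputs)

    # One context-dependent vector per sub-token: (batch, sequence_length, 768)
    print(outputs.last_hidden_state.shape)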

What Contributes to NLP Performance?
Identifying the dominant driver of NLP performance is complex — some settings are more important than others, and, as our study reveals, a simple, one-at-a-time exploration of these settings would not yield the correct answers.

The key to optimizing performance, captured in the design of ALBERT, is to allocate the model’s capacity more efficiently. Input-level embeddings (words, sub-tokens, etc.) need to learn context-independent representations, a representation for the word “bank”, for example. In contrast, hidden-layer embeddings need to refine that into context-dependent representations, e.g., a representation for “bank” in the context of financial transactions, and a different representation for “bank” in the context of river-flow management.

This is achieved by factorization of the embedding parametrization — the embedding matrix is split between input-level embeddings with a relatively-low dimension (e.g., 128), while the hidden-layer embeddings use higher dimensionalities (768 as in the BERT case, or more). With this step alone, ALBERT achieves an 80% reduction in the parameters of the projection block, at the expense of only a minor drop in performance — 80.3 SQuAD2.0 score, down from 80.4; or 67.9 on RACE, down from 68.2 — with all other conditions the same as for BERT.
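
To see where the savings come from, it helps to compare raw parameter counts. A back-of-the-envelope sketch (the 30,000-token vocabulary is an assumed round number close to the real SentencePiece vocabulary), which lands near the ~80% reduction quoted above:

    # Sketch: parameter count of a full-size embedding matrix (BERT-style)
    # versus ALBERT's factorized embedding. The vocabulary size is an assumed
    # round number.
    V = 30_000   # vocabulary size (assumed)
    H = 768      # hidden size (BERT-base / ALBERT-base)
    E = 128      # ALBERT's low-dimensional input embedding

    bert_style = V * H              # project tokens straight to the hidden size
    albert_style = V * E + E * H    # small embedding, then a learned projection

    print(f"BERT-style embedding: {bert_style:,} parameters")    # 23,040,000
    print(f"ALBERT factorization: {albert_style:,} parameters")  # 3,938,304
    print(f"Reduction: {1 - albert_style / bert_style:.0%}")     # 83%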

Another critical design decision for ALBERT stems from a different observation, one about redundancy. Transformer-based neural network architectures (such as BERT, XLNet, and RoBERTa) rely on independent layers stacked on top of each other. However, we observed that the network often learned to perform similar operations at various layers, using different parameters. ALBERT eliminates this possible redundancy by sharing parameters across the layers, i.e., the same layer is applied repeatedly, one on top of the other. This approach slightly diminishes accuracy, but the more compact size is well worth the tradeoff. Parameter sharing achieves a 90% parameter reduction for the attention-feedforward block (a 70% reduction overall), which, when applied in addition to the factorization of the embedding parameterization, incurs a slight performance drop of -0.3 on SQuAD2.0 to 80.0, and a larger drop of -3.9 on the RACE score to 64.0.
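
Conceptually, cross-layer sharing just means reusing one layer’s weights at every depth instead of stacking independent layers. A PyTorch-flavored sketch of the idea, using torch.nn.TransformerEncoderLayer as a stand-in for the real blocks (an illustration, not the released implementation):

    # Sketch: a BERT-style stack of independent layers versus ALBERT-style
    # sharing, where one layer's parameters are reused at every depth.
    import torch
    import torch.nn as nn

    H, HEADS, DEPTH = 768, 12, 12

    # BERT-style: 12 layers, 12 independent parameter sets
    independent = nn.ModuleList(
        [nn.TransformerEncoderLayer(d_model=H, nhead=HEADS, dim_feedforward=3072)
         for _ in range(DEPTH)]
    )

    # ALBERT-style: one parameter set, applied 12 times
    shared = nn.TransformerEncoderLayer(d_model=H, nhead=HEADS, dim_feedforward=3072)

    def albert_encode(x):
        for _ in range(DEPTH):
            x = shared(x)       # the same weights at every layer
        return x

    def count(module):
        return sum(p.numel() for p in module.parameters())

    print(f"independent layers: {count(independent):,} parameters")
    print(f"shared layer:       {count(shared):,} parameters")  # roughly 1/12

    x = torch.randn(16, 2, H)   # (sequence, batch, hidden), the default layout
    print(albert_encode(x).shape)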

Implementing these two design changes together yields an ALBERT-base model that has only 12M parameters, an 89% parameter reduction compared to the BERT-base model, yet still achieves respectable performance across the benchmarks considered. But this parameter-size reduction provides the opportunity to scale up the model again. Assuming that memory size allows, one can scale up the size of the hidden-layer embeddings by 10-20x. With a hidden-size of 4096, the ALBERT-xxlarge configuration achieves both an overall 30% parameter reduction compared to the BERT-large model, and, more importantly, significant performance gains: +4.2 on SQuAD2.0 (88.1, up from 83.9), and +8.5 on RACE (82.3, up from 73.8).

These results indicate that accurate language understanding depends on developing robust, high-capacity contextual representations. The context, modeled in the hidden-layer embeddings, captures the meaning of the words, which in turn drives the overall understanding, as directly measured by model performance on standard benchmarks.

Optimized Model Performance with the RACE Dataset
To evaluate the language understanding capability of a model, one can administer a reading comprehension test (e.g., similar to the SAT Reading Test). This can be done with the RACE dataset (2017), the largest publicly available resource for this purpose. Computer performance on this reading comprehension challenge mirrors well the language modeling advances of the last few years: a model pre-trained with only context-independent word representations scores poorly on this test (45.9), while BERT, with context-dependent language knowledge, scores relatively well with a 72.0. Refined BERT models, such as XLNet and RoBERTa, set the bar even higher, in the 82-83 score range. The ALBERT-xxlarge configuration mentioned above yields a RACE score in the same range (82.3), when trained on the base BERT dataset (Wikipedia and Books). However, when trained on the same larger dataset as XLNet and RoBERTa, it significantly outperforms all other approaches to date, and establishes a new state-of-the-art score at 89.4.

Machine performance on the RACE challenge (SAT-like reading comprehension). A random-guess baseline scores 25.0; the maximum possible score is 95.0.

The success of ALBERT demonstrates the importance of identifying the aspects of a model that give rise to powerful contextual representations. By focusing improvement efforts on these aspects of the model architecture, it is possible to greatly improve both the model efficiency and performance on a wide range of NLP tasks. To facilitate further advances in the field of NLP, we are open-sourcing ALBERT to the research community.

Coral raises $4.3M to build an at-home manicure machine


Coral is a company that wants to “simplify the personal care space through smart automation,” and they’ve raised $4.3 million to get it done. Their first goal? An at-home, fully automated machine for painting your nails. Stick a finger in, press down, wait a few seconds and you’ve got a fully painted and dried nail. More than once in our conversations, the team referred to the idea as a “Keurig coffee machine, but for nails.”

It’s still early days for the company. While they’ve got a functional machine (pictured above), they’re quite clear about it being a prototype.

As such, they’re still staying pretty hush-hush about the details, declining to say much about how it actually works. They did tell me that it paints one finger at a time, taking about 10 minutes to go from bare nails to all fingers painted and dried. To speed up drying time while ensuring a durable paint job, it’ll require Coral’s proprietary nail polish, so don’t expect to be able to pop open a bottle of nail polish and pour it in. Coral’s polish will come in pods (so the Keurig comparison is particularly fitting), which the user will be able to buy individually or get via subscription. Under the hood are a camera and some proprietary computer vision algorithms, allowing the machine to paint the nail accurately without requiring manual nail cleanup from the user after the fact.

Also still under wraps — or, more accurately, not determined yet — is the price. While Coral co-founder Ramya Venkateswaran tells me that she expects it to be a “premium device,” they haven’t nailed down an exact price just yet.

While we’ve seen all sorts of nail painting machines over the years (including ones that can do all kinds of wild art, like this one we saw at CES earlier this year), Coral says its system is the only one that works without requiring the user to first prime their nails with a base coat or apply a clear coat afterwards. All you need here is a bare fingernail.

Coral’s team is currently made up of eight people, mostly mechanical, chemical and software engineers. Both co-founders, meanwhile, have backgrounds in hardware; Venkateswaran previously worked as a product strategy manager at Dolby, where she helped launch the Dolby Conference Phone. Her co-founder, Bradley Leong, raised around $800,000 on Kickstarter to ship Brydge (one of the earliest takes on a laptop-style iPad keyboard) back in 2012 before becoming a partner at the seed-stage venture fund Tandem Capital. It was during some industrial hardware research there, he tells me, that he found “the innovation that this machine is based off of.”

Venkateswaran tells me the team has raised $4.3 million to date from CrossLink Capital, Root Ventures, Tandem Capital and Y Combinator. The company is part of Y Combinator’s ongoing Winter 2020 class, so I’d expect to hear more about them as this batch’s demo day approaches in March of next year.

So what’s next? They’ll be working on turning the prototype into a consumer-ready device, and plan to spend the next few months running a small beta program (which you can sign up for here).


Read Full Article