
Google has admitted to regulators in other jurisdictions that the average share that accrues to the company from ad revenues is roughly 30 percent

Shortly after the US Department of Justice filed an antitrust action against Google over its allegedly monopolistic behaviour in web search, there are indications that the company's advertising practices might also be scrutinised closely. This would have deep implications for India.

According to a LiveMint report, about 65 to 70 percent of India's digital ad market is controlled by two companies - Google and Facebook.

Google has admitted to regulators in other jurisdictions that the average share that accrues to the company from ad revenues is roughly 30 percent. This is about the same percentage that the company planned to charge as a Play Store fee, a move that drew protests from several startups.

“Almost every ad that appears on every free app (on the Android ecosystem) has to essentially pass via Google," said Anupam Manur, who studies platform economics at Takshashila Institution, a think tank.

“Google has a near-monopoly on apps that run on Android," Manur told the paper.

Reports hint that DoJ investigators have been in talks with third-party ad marketers in the US at least since the beginning of this year. The UK and Australia are also looking at Google’s dominance in digital advertising.

As advertising becomes the central driver of the internet economy, this is reflected in Google's revenues: ads remain the main source of the company's profitability.

Google reported $116.3 billion in advertising revenue (85 percent of overall sales) last year, the report said.

According to Smriti Parsheera, a tech policy researcher with the National Institute of Public Finance and Policy (NIPFP), it is clear that Google is a dominant player in both (web) search and search advertising.

"This is an issue that has already been decided by a case before the Competition Commission of India, in which Google was fined (in 2018)," said Parasheera, adding that the question is whether it is using this dominance in anti-competitive ways.

[Source: This article was published in moneycontrol.com - Uploaded by the Association Member: Anthony Frank]

Categorized in Search Engine

Identity fraud is now more threatening than ever

Technology is changing the way people do business but, in doing so, it increases the risks around security. Identity fraud is especially on the rise. In fact, it’s estimated this type of fraud has doubled in just the last year. And, while the banking sector may be the juiciest target for attempted identity fraud, security is not purely a banking concern.

In 2015, damage caused by internet fraud amounted to $3 trillion worldwide. The latest predictions say it will reach $6 trillion in 2021. This makes cyber fraud one of the biggest threats to our economy and the fastest-growing crime; it is becoming far more profitable than the global trade in illegal drugs.

Enterprises all over the world need to focus on this cost-intensive problem. With over 1.9 billion websites and counting, the opportunities for fraud are enormous – a serious problem that must be curbed.

Most common identity fraud methods

Of all fraud methods, social engineering is the biggest issue for companies. It became the most common fraud method in 2019, accounting for 73% of all attempted attacks, according to our own research. It lures unsuspecting users into providing or using their confidential data and is increasingly popular with fraudsters, being efficient and difficult to recognise.

Fraudsters trick innocent people into registering for a service using their own valid ID. The account they open is then taken over by the fraudster and used to generate value by withdrawing money or making online transfers.

They mainly look for their victims on online portals where people search for jobs, buy and sell things, or connect with other people. In most cases, the fraudsters use fake job ads, app testing offers, cheap loan offers, or fake IT support to lure their victims. People are contacted on channels like eBay Classifieds, job search engines and Facebook.

Fraudsters are also creating sophisticated architecture to boost the credibility of these cover stories, including fake corporate email addresses, fake ads, and fake websites.

In addition, we are seeing more applicants being coached, either by messenger or video call, on what to say during the identity verification process. Specifically, they are instructed to say that they were not prompted to open the account by a third party but are doing so by choice.

How to fight social engineering

If organisations are to consistently stay ahead of the latest fraud methods and protect their customers, they need to have the right technology in place to be able to track fraudulent activity, react quickly and be flexible in reengineering the security system.

Crucially, it requires a mix of technical and ‘personal’ mechanisms. Some methods include:

Device binding – to ensure that the only person who can use an app, and the account behind it, is the person entitled to do so, device binding is highly effective. From the moment a customer signs up for a service, the app binds to the device used (a mobile phone, for example) and, as soon as another device is used, the customer needs to verify themselves again (a minimal sketch follows this list).

Psychological questions – to detect social engineering, even when it is well disguised, trained staff are an additional safety net, applied on top of the standard checks at the start of the verification process. They ask a customer an advanced set of questions once an elevated risk of a social engineering attack is detected. These questions are constantly updated as new attack patterns emerge.

Takedown service – with every attack, organisations can learn. This means constantly checking new methods and tricks to identify websites which fraudsters are using to lure in innocent people. And, by working with an identity verification provider that has good connections to the most used web hosts and a very engaged research team, they are able to take hundreds of these websites offline.
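To make the device binding flow concrete, here is a minimal sketch in Python. The in-memory registry, fingerprinting scheme, and example device strings are all simplified assumptions, not a production design:

    import hashlib

    DEVICE_REGISTRY = {}  # user_id -> fingerprint of the bound device

    def fingerprint(device_info: str) -> str:
        # Derive a stable fingerprint from device attributes (illustrative only;
        # real device binding uses hardware-backed keys, not a plain hash).
        return hashlib.sha256(device_info.encode()).hexdigest()

    def bind_device(user_id: str, device_info: str) -> None:
        # Called once, when the customer signs up on their own phone.
        DEVICE_REGISTRY[user_id] = fingerprint(device_info)

    def check_device(user_id: str, device_info: str) -> bool:
        # Called on every request; False means re-verification is required.
        return DEVICE_REGISTRY.get(user_id) == fingerprint(device_info)

    bind_device("alice", "pixel-7|android-14|app-3.2.1")
    if not check_device("alice", "unknown-device|android-12|app-3.2.1"):
        print("New device detected - ask the customer to verify again")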

 

Fake ID fraud

However, social engineering isn’t the only common type of identity fraud. Organisations should be aware of fake ID fraud. Our research indicates fake IDs are available on the dark web for as little as €50 and some of them are so realistic they can often fool human passport agents. The most commonly faked documents are national ID cards, followed by passports in second place. Other documents include residence permits and driving licenses.

The quality of these fake IDs is increasing too. Where in the past fraudsters used simple colour copies of ID cards, now they are switching to more advanced, and more costly, falsifications that even include holograms.

Biometric security is extremely effective at fighting this kind of fraud. It can check and detect holograms and other features like optically variable inks simply by moving the ID in front of the camera. Machine learning algorithms can also be used for dynamic visual detection.

Similarity fraud is another method used by fraudsters, although it’s not as common thanks to the development of easier and more efficient ways (like social engineering). This method sees a fraudster use a genuine, stolen, government-issued ID that belongs to a person with similar facial features.

To fight similarity fraud, biometric checks and liveness checks used together are very effective – and they are much more precise and accurate than a human could ever be without the help of state-of-the-art security technology.

The biometric check scans the characteristics of the customer's face and compares them to the picture on their ID card or passport. If the technology confirms that all of the important features match in both pictures, it hands over to the liveness check, a liveness detection program that verifies the customer's presence. It builds a 3D model of their face by taking photos from different angles while the customer moves according to instructions.

The biometric check itself could be tricked with a photo but, in combination with the liveness check, it proves there is a real person in front of the camera.
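As a rough illustration of that two-step flow, the sketch below uses the open-source face_recognition library for the photo comparison; the liveness step is left as a stub, since real liveness detection needs a dedicated model and a guided video capture. File names and the tolerance value are assumptions:

    import face_recognition

    def biometric_match(id_photo_path, selfie_path, tolerance=0.6):
        # Compare the face on the ID document with the customer's selfie.
        id_faces = face_recognition.face_encodings(
            face_recognition.load_image_file(id_photo_path))
        selfie_faces = face_recognition.face_encodings(
            face_recognition.load_image_file(selfie_path))
        if not id_faces or not selfie_faces:
            return False  # no face detected in one of the images
        return bool(face_recognition.compare_faces(
            [id_faces[0]], selfie_faces[0], tolerance=tolerance)[0])

    def check_liveness(video_frames):
        # Placeholder: a real implementation builds a 3D model from angled
        # frames while the customer follows on-screen instructions.
        raise NotImplementedError

    def verify_customer(id_photo, selfie, video_frames):
        # A photo match alone can be fooled by a printed picture,
        # so both checks must pass.
        return biometric_match(id_photo, selfie) and check_liveness(video_frames)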

Fighting back

The threat of identity fraud is not going away and, as fraudsters become more and more sophisticated, so too must technology. With the right investment in advanced technology measures, organisations will be in a much stronger position to stop fraudsters in their tracks and protect their customers from the risk of identity fraud.

 [Source: This article was published in techradar.com By Charlie Roberts - Uploaded by the Association Member: Alex Gray]

Categorized in Investigative Research

Crowdfunding has become the de facto way to support individual ventures and philanthropic efforts. But as crowdfunding platforms have risen to prominence, they’ve also attracted malicious actors who take advantage of unsuspecting donors. Last August, a report from The Verge investigated the Dragonfly Futurefön, a decade-long fraud operation that cost victims nearly $6 million and caught the attention of the FBI. Two years ago, the U.S. Federal Trade Commission announced it was looking into a campaign for a Wi-Fi-enabled, battery-powered backpack that disappeared with more than $700,000.

GoFundMe previously said fraudulent campaigns make up less than 0.1% of all those on its platform, but with millions of new projects launching each year, many bad actors are able to avoid detection. To help catch them, researchers at University College London, Telefonica Research, and the London School of Economics devised an AI system that takes into account textual and image-based features to classify fraudulent crowdfunding behavior at the moment of publication. They claim it’s up to 90.14% accurate at distinguishing between fraudulent and legitimate crowdfunding behavior, even without any user or donation activity.

While two of the largest crowdfunding platforms on the web — GoFundMe and Kickstarter — employ forms of automation to spot potential fraud, neither claims to take the AI-driven approach advocated by the study coauthors. A spokesperson for GoFundMe told VentureBeat the company relies on the “dedicated experts” on its trust and safety team, who use technology “on par with the financial industry” and community reports to spot fraudulent campaigns. To do this, they look at things like:

  • Whether the campaign abides by the terms of service
  • Whether it provides enough information for donors
  • Whether it’s plagiarized
  • Who started the campaign
  • Who is withdrawing funds
  • Who should be receiving funds

Kickstarter says it doesn’t use AI or machine learning tools to prevent fraud, excepting proprietary automated tools, and that the majority of its investigative work is performed manually by looking at what signals surface and analyzing them to guide any action taken. A spokesperson told VentureBeat that in 2018 Kickstarter’s team suspended 354 projects and 509,487 accounts and banned 5,397 users for violating the company’s rules and guidelines — 8 times as many as it suspended in 2017.

The researchers would argue those efforts don’t go far enough. “We find that fraud is a small percentage of the crowdfunding ecosystem, but an insidious problem. It corrodes the trust ecosystem on which these platforms operate, endangering the support that thousands of people receive year on year,” they wrote. “[Crowdfunding platforms aren’t properly] incentivized to combat fraud among users and the campaigns they launch: On the one hand, a platform’s revenue is directly proportional to the number of transactions performed (since the platform charges a fixed amount per donation); on the other hand, if a platform is transparent with respect to how much fraud it has, it may discourage potential donors from participating.”

To build a corpus that could be used to “teach” the above-mentioned system to pick out fraudulent campaigns, the researchers sourced entries from GoFraudMe, a resource that aims to catalog fraudulent cases on the platform. They then created two manually annotated data sets focusing on the health domain, where the monetary and emotional stakes tend to be high. One set contained 191 campaigns from GoFundMe’s medical category, while the other contained 350 campaigns from different crowdfunding platforms (Indiegogo, GoFundMe, MightyCause, Fundrazr, and Fundly) that were directly related to organ transplants.

 

Human annotators labeled each of the roughly 700 campaigns in the corpora as “fraud” or “not-fraud” according to guidelines that included factors like evidence of contradictory information, a lack of engagement on the part of donors, and participation of the creator in other campaigns. Next, the researchers examined different textual and visual cues that might inform the system’s analysis:

  • Sentiment analysis: The team extracted the sentiments and tones expressed in campaign descriptions using IBM’s Watson natural language processing service. They computed the sentiment as a probability across five emotions (sadness, joy, fear, disgust, and anger) before analyzing confidence scores for seven possible tones (frustration, satisfaction, excitement, politeness, impoliteness, sadness, and sympathy).
  • Complexity and language choice: Operating on the assumption that fraudsters prefer simpler language and shorter sentences, the researchers checked language complexity and word choice in the campaign descriptions. They looked at both a series of readability scores and language features like function words, personal pronouns, and average syllables per word, as well as the total number of characters (see the sketch after this list).
  • Form of the text: The coauthors examined the visual structure of campaign text, looking at things like whether the letters were all lowercase or all uppercase and the number of emojis in the text.
  • Word importance and named-entity recognition: The team computed word importance for the text in the campaign description, revealing similarities (and dissimilarities) among campaigns. They also identified proper nouns, numeric entities, and currencies in the text and assigned them to a finite set of categories.
  • Emotion representation: The researchers repurposed a pretrained AI model to classify campaign images as evoking one of eight emotions (amusement, anger, awe, contentment, disgust, excitement, fear, and sadness) by fine-tuning it on 23,000 emotion-labeled images from Flickr and Instagram.
  • Appearance and semantic representation: Using another AI model, the researchers extracted image appearance representations that provided a description of each image, like dominant colors, the textures of the edges of segments, and the presence of certain objects. They also used a face detector algorithm to estimate the number of faces present in each image.
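To make the "complexity and language choice" cues concrete, here is a minimal Python sketch using the textstat package for a readability score; the researchers' exact feature set is only summarized above, so this is an approximation:

    import textstat

    PRONOUNS = {"i", "me", "my", "we", "us", "our", "you", "your"}

    def text_features(description: str) -> dict:
        words = description.split()
        return {
            "readability": textstat.flesch_reading_ease(description),
            "avg_syllables_per_word":
                textstat.syllable_count(description) / max(len(words), 1),
            "personal_pronouns":
                sum(w.lower().strip(".,!?") in PRONOUNS for w in words),
            "total_characters": len(description),
        }

    print(text_features("Please help us. We urgently need funds for my surgery."))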

After boiling many thousands of possible features down to 71 textual and 501 visual variables, the researchers used them to train a machine learning model to automatically detect fraudulent campaigns. Arriving at this ensemble model required building sub-models to classify images and text as fraudulent or not fraudulent and combining the results into a single score for each campaign.
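A hedged sketch of that ensemble idea, with scikit-learn random forests standing in for the paper's unspecified sub-models and synthetic data in place of the real feature matrices:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X_text = rng.normal(size=(500, 71))    # stand-in for the 71 textual variables
    X_image = rng.normal(size=(500, 501))  # stand-in for the 501 visual variables
    y = rng.integers(0, 2, size=500)       # 1 = fraud, 0 = legitimate

    text_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_text, y)
    image_model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_image, y)

    def fraud_score(x_text_row, x_image_row) -> float:
        # Average the two sub-models' fraud probabilities into one score;
        # a trained meta-classifier could replace the simple average.
        p_text = text_model.predict_proba([x_text_row])[0, 1]
        p_image = image_model.predict_proba([x_image_row])[0, 1]
        return (p_text + p_image) / 2

    print(fraud_score(X_text[0], X_image[0]))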

The coauthors claim their approach revealed peculiar trends, like the fact that legitimate campaigns are more likely to have images with at least one face compared with fraudulent campaigns. On the other hand, fraudulent campaigns are generally more desperate in their appeals, in contrast with legitimate campaigns’ descriptiveness and openness about circumstances.

“In recent years, crowdfunding has emerged as a means of making personal appeals for financial support to members of the public … The community trusts that the individual who requests support, whatever the task, is doing so without malicious intent,” the researchers wrote. “However, time and again, fraudulent cases come to light, ranging from fake objectives to embezzlement. Fraudsters often fly under the radar and defraud people of what adds up to tens of millions, under the guise of crowdfunding support, enabled by small individual donations. Detecting and preventing fraud is thus an adversarial problem. Inevitably, perpetrators adapt and attempt to bypass whatever system is deployed to prevent their malicious schemes.”

It’s possible that the system might be latching onto certain features in making its predictions, exhibiting a bias that’s not obvious at first glance. That’s why the coauthors plan to improve it by taking into account sources of labeling bias and test its robustness against unlabeled medically related campaigns across crowdfunding platforms.

“This is a significant step in building a system that is preemptive (e.g., a browser plugin) as opposed to reactive,” they wrote. “We believe our method could help build trust in this ecosystem by allowing potential donors to vet campaigns before contributing.”

[Source: This article was published in venturebeat.com By Kyle Wiggers - Uploaded by the Association Member: Jeremy Frink]

Categorized in Investigative Research

With much of the country still under some form of lockdown due to COVID-19, communities are increasingly reliant upon the internet to stay connected.

The coronavirus’s ability to relegate professional, political, and personal communications to the web underscores just how important end-to-end encryption has already become for internet privacy. During this unprecedented crisis, just like in times of peace and prosperity, watering down online consumer protection is a step in the wrong direction.

The concept of end-to-end encryption is simple: platforms or services that use it employ cryptographic software to ensure that only the sender and the receiver can access the information being sent.
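A minimal sketch of that idea in Python, using the PyNaCl bindings to libsodium. Real messaging apps layer key agreement and forward secrecy on top; this only shows the core property that the relaying server never sees plaintext:

    from nacl.public import PrivateKey, Box

    # Each party generates a keypair; only the public halves are ever shared.
    sender_key = PrivateKey.generate()
    recipient_key = PrivateKey.generate()

    # The sender encrypts to the recipient's public key.
    ciphertext = Box(sender_key, recipient_key.public_key).encrypt(b"meet at noon")

    # The platform relays only this ciphertext and cannot read it. The
    # recipient decrypts with their private key and the sender's public key.
    plaintext = Box(recipient_key, sender_key.public_key).decrypt(ciphertext)
    assert plaintext == b"meet at noon"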

At present, many common messaging apps or video calling platforms offer end-to-end encryption, while the world’s largest social media platforms are in various stages of releasing their own form of encrypted protection.

End-to-end encryption provides consumers with the confidence that their most valuable information online will not be intercepted. In addition to personal correspondence, bank details, health records, and commercial secrets are just some of the private information entered and exchanged through encrypted connections.

With consumers unable to carry out routine business in person, such as visiting the DMV, a wealth of private data is increasingly being funneled into online transactions during the COVID-19 pandemic.

Unsurprisingly, however, the ability to communicate online in private has drawn the ire of law enforcement, who are wary of malicious actors being able to coordinate in secret. For example, earlier this year Attorney General Bill Barr called on Apple to unlock two iPhones as part of a Florida terror investigation.

The request is just the latest chapter in the Justice Department’s battle with cellphone makers to get access to private encrypted data.

While Apple has so far refused to compromise the integrity of its encryption, the push to poke loopholes into online privacy continues. The problem is not the Justice Department investigation itself, but rather the precedent it would set.

As Apple CEO Tim Cook noted in 2016, cracking encryption or installing a backdoor would effectively create a “master key.” With it, law enforcement would be able to access any number of devices.

Law enforcement agents already have a panoply of measures at their fingertips to access the private communications of suspected criminals and terrorists. From the now-infamous FISA warrants used to wiretap foreign spies to the routine subpoenas used to access historic phone records, investigators employ a variety of methods to track and prosecute criminals.

Moreover, creating a backdoor to encrypted services introduces a weak link in the system that could be exploited by countless third-party hackers. While would-be terrorists and criminals will simply shift their communications to new, yet-to-be cracked encryption services, everyday internet users will face a higher risk of having their data stolen. An effort to stop the crime that results in an opportunity for even more crime seems like a futile move.

Efforts to weaken encryption protections now appear even more misjudged due to a rise in cybercrime during the COVID-19 pandemic. Organizations such as the World Health Organization have come under cyberattack in recent weeks, with hundreds of email passwords being stolen.

Similarly, American and European officials have recently warned that hospitals and research institutions are increasingly coming under siege from hackers. According to the FBI, online crime has quadrupled since the beginning of the pandemic. In light of this cyber-crimewave, it seems that now is the time for more internet privacy protection, not less.

Internet users across America, and around the world, rely on end-to-end encryption for countless uses online. This reliance has only increased during the COVID-19 pandemic, as more consumers turn to online solutions.

Weakening internet privacy protections to fight crime might benefit law enforcement, but it would introduce new risk to law-abiding consumers.

[Source: This article was published in insidesources.com By Oliver McPherson-Smith- Uploaded by the Association Member: Jennifer Levin]

Categorized in Internet Privacy

Reverse image search is one of the most well-known and easiest digital investigative techniques, with two-click functionality of choosing “Search Google for image” in many web browsers. This method has also seen widespread use in popular culture, perhaps most notably in the MTV show Catfish, which exposes people in online relationships who use stolen photographs on their social media.

However, if you only use Google for reverse image searching, you will be disappointed more often than not. Limiting your search process to uploading a photograph in its original form to just images.google.com may give you useful results for the most obviously stolen or popular images, but for most any sophisticated research project, you need additional sites at your disposal — along with a lot of creativity.

This guide will walk through detailed strategies to use reverse image search in digital investigations, with an eye towards identifying people and locations, along with determining an image’s provenance. After detailing the core differences between the search engines, Yandex, Bing, and Google are tested on five test images showing different objects and from various regions of the world.

Beyond Google

The first and most important piece of advice on this topic cannot be stressed enough: Google reverse image search isn’t very good.

As of this guide’s publication date, the undisputed leader of reverse image search is the Russian site Yandex. After Yandex, the runners-up are Microsoft’s Bing and Google. A fourth service that could also be used in investigations is TinEye, but this site specializes in intellectual property violations and looks for exact duplicates of images.
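TinEye's matching method is proprietary, but the duplicate-detection idea can be illustrated with a perceptual hash, which flags near-identical images even after resizing or recompression. A sketch using the Pillow and imagehash packages, with hypothetical file names:

    from PIL import Image
    import imagehash

    hash_a = imagehash.phash(Image.open("original.jpg"))
    hash_b = imagehash.phash(Image.open("recompressed_copy.jpg"))

    # A small Hamming distance means the two files are almost certainly
    # the same underlying image; unrelated images score far higher.
    if hash_a - hash_b <= 5:
        print("near-duplicate found")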

Yandex

Yandex is by far the best reverse image search engine, with a scary-powerful ability to recognize faces, landscapes, and objects. This Russian site draws heavily upon user-generated content, such as tourist review sites (e.g. FourSquare and TripAdvisor) and social networks (e.g. dating sites), for remarkably accurate results with facial and landscape recognition queries.

Its strengths lie in photographs taken in a European or former-Soviet context. While photographs from North America, Africa, and other places may still return useful results on Yandex, you may find yourself frustrated by scrolling through results mostly from Russia, Ukraine, and eastern Europe rather than the country of your target images.

To use Yandex, go to images.yandex.com, then choose the camera icon on the right.

[Image: Yandex image search page with the camera icon]

From there, you can either upload a saved image or type in the URL of one hosted online.

[Image: Yandex upload and image-URL options]

If you get stuck with the Russian user interface, look out for Выберите файл (Choose file), Введите адрес картинки (Enter image address), and Найти (Search). After searching, look out for Похожие картинки (Similar images), and Ещё похожие (More similar).

 

The facial recognition algorithms used by Yandex are shockingly good. Not only will Yandex look for photographs that look similar to the one that has a face in it, but it will also look for other photographs of the same person (determined through matching facial similarities) with completely different lighting, background colors, and positions. While Google and Bing may just look for other photographs showing a person with similar clothes and general facial features, Yandex will search for those matches, and also other photographs of a facial match. Below, you can see how the three services searched the face of Sergey Dubinsky, a Russian suspect in the downing of MH17. Yandex found numerous photographs of Dubinsky from various sources (only two of the top results had unrelated people), with the result differing from the original image but showing the same person. Google had no luck at all, while Bing had a single result (fifth image, second row) that also showed Dubinsky.

[Images: Yandex, Google, and Bing search results for Sergey Dubinsky]

Yandex is, obviously, a Russian service, and there are worries and suspicions of its ties (or potential future ties) to the Kremlin. While we at Bellingcat constantly use Yandex for its search capabilities, you may be a bit more paranoid than us. Use Yandex at your own risk, especially if you are also worried about using VK and other Russian services. If you aren’t particularly paranoid, try searching an un-indexed photograph of yourself or someone you know in Yandex, and see if it can find yourself or your doppelganger online.

Bing

Over the past few years, Bing has caught up to Google in its reverse image search capabilities, but is still limited. Bing’s “Visual Search”, found at images.bing.com, is very easy to use, and offers a few interesting features not found elsewhere.

[Image: Bing Visual Search interface]

Within an image search, Bing allows you to crop a photograph (button below the source image) to focus on a specific element in said photograph, as seen below. The results with the cropped image will exclude the extraneous elements, focusing on the user-defined box. However, if the selected portion of the image is small, it is worth it to manually crop the photograph yourself and increase the resolution — low-resolution images (below 200×200) bring back poor results.
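If you prefer to do the cropping outside of Bing, the manual crop-and-upscale step takes a few lines with the Pillow library; the box coordinates and file names below are placeholders:

    from PIL import Image

    img = Image.open("street_scene.jpg")
    detail = img.crop((420, 310, 620, 460))  # (left, upper, right, lower)

    # Upscale the crop well past the ~200x200 floor mentioned above.
    w, h = detail.size
    detail = detail.resize((w * 3, h * 3), Image.Resampling.LANCZOS)
    detail.save("detail_for_search.jpg")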

Below, a Google Street View image of a man walking a couple of pugs was cropped to focus on just the pooches, leading Bing to suggest the breed of dog visible in the photograph (the “Looks like” feature), along with visually similar results. These results mostly included pairs of dogs being walked, matching the source image, but did not always include only pugs, as French bulldogs, English bulldogs, mastiffs, and others are mixed in.

[Image: Bing results for the cropped pug image]

Google

By far the most popular reverse image search engine, at images.google.com, Google is fine for most rudimentary reverse image searches. Some of these relatively simple queries include identifying well-known people in photographs, finding the source of images that have been shared quite a bit online, determining the name and creator of a piece of art, and so on. However, if you want to locate images that are not close to an exact copy of the one you are researching, you may be disappointed.

For example, when searching for the face of a man who tried to attack a BBC journalist at a Trump rally, Google can find the source of the cropped image, but cannot find any additional images of him, or even someone who bears a passing resemblance to him.

[Image: cropped face from the Trump rally photo]

[Image: Google results for the cropped face]

While Google was not very strong in finding other instances of this man’s face or similar-looking people, it still found the original, un-cropped version of the photograph the screenshot was taken from, showing some utility.

Five Test Cases

For testing out different reverse image search techniques and engines, a handful of images representing different types of investigations are used, including both original photographs (not previously uploaded online) and recycled ones. Due to the fact that these photographs are included in this guide, it is likely that these test cases will not work as intended in the future, as search engines will index these photographs and integrate them into their results. Thus, screenshots of the results as they appeared when this guide was being written are included.

These test photographs include a number of different geographic regions to test the strength of search engines for source material in western Europe, eastern Europe, South America, southeast Asia, and the United States. With each of these photographs, I have also highlighted discrete objects within the image to test out the strengths and weaknesses for each search engine.

Feel free to download these photographs (every image in this guide is hyperlinked directly to a JPEG file) and run them through search engines yourself to test out your skills.

Olisov Palace In Nizhny Novgorod, Russia (Original, not previously uploaded online)

[Image: test-a.jpg]

Isolated: White SUV in Nizhny Novgorod

[Image: test-a-suv.jpg]

Isolated: Trailer in Nizhny Novgorod

[Image: test-a-trailer.jpg]

Cityscape In Cebu, Philippines (Original, not previously uploaded online)

[Image: test-b.jpg]

Isolated: Condominium complex, "The Padgett Place"

[Image: b-toweronly.jpg]

Isolated: "Waterfront Hotel"

[Image: b-tower2only.jpg]

Students From Bloomberg 2020 Ad (Screenshot from video)

[Image: test-c.jpg]

Isolated: Student

[Image: c-studentonly.jpg]

Av. do Café In São Paulo, Brazil (Screenshot from Google Street View)

[Image: test-d.jpg]

Isolated: Toca do Açaí

[Image: d-tocadoacai.jpg]

Isolated: Estacionamento (Parking)

[Image: d-estacionameno.jpg]

Amsterdam Canal (Original, not previously uploaded online)

[Image: test-e.jpg]

Isolated: Grey Heron

[Image: test-e-bird.jpg]

Isolated: Dutch Flag (also rotated 90 degrees clockwise)

[Image: test-e-flag.jpg]

Results

Each of these photographs were chosen in order to demonstrate the capabilities and limitations of the three search engines. While Yandex in particular may seem like it is working digital black magic at times, it is far from infallible and can struggle with some types of searches. For some ways to possibly overcome these limitations, I’ve detailed some creative search strategies at the end of this guide.

 

Novgorod’s Olisov Palace

Predictably, Yandex had no trouble identifying this Russian building. Along with photographs from a similar angle to our source photograph, Yandex also found images from other perspectives, including 90 degrees counter-clockwise (see the first two images in the third row) from the vantage point of the source image.

[Image: a-results-yandex.jpg]

Yandex also had no trouble identifying the white SUV in the foreground of the photograph as a Nissan Juke.

[Image: a-results-suv-yandex.jpg]

Lastly, in the most challenging isolated search for this image, Yandex was unsuccessful in identifying the non-descript grey trailer in front of the building. A number of the results look like the one from the source image, but none are an actual match.

[Image: a-results-trailer-yandex.jpg]

Bing had no success in identifying this structure. Nearly all of its results were from the United States and western Europe, showing houses with white/grey masonry or siding and brown roofs.

[Image: a-results-bing.jpg]

Likewise, Bing could not determine that the white SUV was a Nissan Juke, instead focusing on an array of other white SUVs and cars.

[Image: a-suvonly-bing.jpg]

Lastly, Bing failed in identifying the grey trailer, focusing more on RVs and larger, grey campers.

[Image: a-trailoronly-bing.jpg]

Google‘s results for the full photograph are comically bad, looking to the House television show and images with very little visual similarity.

[Image: a-results-google.jpg]

Google successfully identified the white SUV as a Nissan Juke, even noting it in the text field search. As seen with Yandex, feeding the search engine an image from a similar perspective as popular reference materials — a side view of a car that resembles that of most advertisements — will best allow reverse image algorithms to work their magic.

[Image: a-suvonly-google.jpg]

Lastly, Google recognized what the grey trailer was (travel trailer / camper), but its “visually similar images” were far from it.

[Image: a-trailoronly-google.jpg]

Scorecard: Yandex 2/3; Bing 0/3; Google 1/3

Cebu

Yandex was technically able to identify the cityscape as that of Cebu in the Philippines, but perhaps only by accident. The fourth result in the first row and the fourth result in the second row are of Cebu, but only the second photograph shows any of the same buildings as in the source image. Many of the results were also from southeast Asia (especially Thailand, which is a popular destination for Russian tourists), noting similar architectural styles, but none are from the same perspective as the source.

[Image: b-results-yandex.jpg]

Of the two buildings isolated from the search (the Padgett Place and the Waterfront Hotel), Yandex was able to identify the latter, but not the former. The Padgett Place is a relatively unremarkable high-rise building filled with condos, while the Waterfront Hotel also has a casino inside, leading to an array of tourist photographs showing its more distinctive architecture.

 

[Image: b-tower1-yandex.jpg]

[Image: b-tower2-yandex.jpg]

Bing did not have any results that were even in southeast Asia when searching for the Cebu cityscape, showing a severe geographic limitation to its indexed results.

[Image: b-results-bing.jpg]

Like Yandex, Bing was unable to identify the building on the left part of the source image.

[Image: b-tower1-bing.jpg]

Bing was unable to find the Waterfront Hotel, both when using Bing’s cropping function (bringing back only low-resolution photographs) and manually cropping and increasing the resolution of the building from the source image. It is worth noting that the results from these two versions of the image, which were identical outside of the resolution, brought back dramatically different results.

[Image: b-tower2-bing.jpg]

[Image: b-tower2-bing2.jpg]

As with Yandex, Google brought back a photograph of Cebu in its results, but without a strong resemblance to the source image. While Cebu was not in the thumbnails for the initial results, following through to “Visually similar images” will fetch an image of Cebu’s skyline as the eleventh result (third image in the second row below).

[Image: b-results-google.jpg]

As with Yandex and Bing, Google was unable to identify the high-rise condo building on the left part of the source image. Google also had no success with the Waterfront Hotel image.

[Image: b-tower1-google.jpg]

[Image: b-tower2-google.jpg]

Scorecard: Yandex 4/6; Bing 0/6; Google 2/6

Bloomberg 2020 Student

Yandex found the source image from this Bloomberg campaign advertisement — a Getty Images stock photo. Along with this, Yandex also found versions of the photograph with filters applied (second result, first row) and additional photographs from the same stock photo series. Also, for some reason, porn, as seen in the blurred results below.

[Image: c-results-yandex.jpg]

When isolating just the face of the stock photo model, Yandex brought back a handful of other shots of the same guy (see last image in first row), plus images of the same stock photo set in the classroom (see the fourth image in the first row).

[Image: c-studentonly-results-yandex.jpg]

Bing had an interesting search result: it found the exact match of the stock photograph, and then brought back “Similar images” of other men in blue shirts. The “Pages with this” tab of the result provides a handy list of duplicate versions of this same image across the web.

[Image: c-results-bing.jpg]

[Image: c-results-bing2.jpg]

Focusing on just the face of the stock photo model does not bring back any useful results, or provide the source image that it was taken from.

[Image: c-studentonly-results-bing.jpg]

Google recognizes that the image used by the Bloomberg campaign is a stock photo, bringing back an exact result. Google will also provide other stock photos of people in blue shirts in class.

[Image: c-results-google.jpg]

In isolating the student, Google will again return the source of the stock photo, but its visually similar images do not show the stock photo model, rather an array of other men with similar facial hair. We’ll count this as a half-win in finding the original image, but not showing any information on the specific model, as Yandex did.

 

[Image: c-studentonly-results-google.jpg]

Scorecard: Yandex 6/8; Bing 1/8; Google 3.5/8

Brazilian Street View

Yandex could not figure out that this image was snapped in Brazil, instead focusing on urban landscapes in Russia.

[Image: d-results-yandex.jpg]

For the parking sign [Estacionamento], Yandex did not even come close.

[Image: d-parking-yandex.jpg]

Bing did not know that this street view image was taken in Brazil.

[Image: d-results-bing.jpg]

…nor did Bing recognize the parking sign

[Image: d-parking-bing.jpg]

…or the Toca do Açaí logo.

[Image: d-toco-bing.jpg]

Despite the fact that the image was directly taken from Google’s Street View, Google reverse image search did not recognize a photograph uploaded onto its own service.

[Image: d-results-google.jpg]

Like Bing and Yandex, Google could not recognize the Portuguese parking sign.

[Image: d-parking-google.jpg]

Lastly, Google did not come close to identifying the Toca do Açaí logo, instead focusing on various types of wooden panels, showing how it focused on the backdrop of the image rather than the logo and words.

 

[Image: d-toca-google.jpg]

Scorecard: Yandex 7/11; Bing 1/11; Google 3.5/11

Amsterdam Canal

Yandex knew exactly where this photograph was taken in Amsterdam, finding other photographs taken in central Amsterdam, and even including ones with various types of birds in the frame.

[Image: e-results-yandex.jpg]

Yandex correctly identified the bird in the foreground of the photograph as a grey heron (серая цапля), also bringing back an array of images of grey herons in a similar position and posture as the source image.

[Image: e-bird-yandex.jpg]

However, Yandex flunked the test of identifying the Dutch flag hanging in the background of the photograph. When rotating the image 90 degrees clockwise to present the flag in its normal pattern, Yandex was able to figure out that it was a flag, but did not return any Dutch flags in its results.

[Image: e-flag-yandex.jpg]

[Image: test-e-flag2.jpg]

[Image: e-flag2-yandex.jpg]

Bing only recognized that this image shows an urban landscape with water, with no results from Amsterdam.

[Image: e-results-bing.jpg]

Though Bing struggled with identifying an urban landscape, it correctly identified the bird as a grey heron, including a specialized “Looks like” result going to a page describing the bird.

[Image: e-bird-bing.jpg]

However, as with Yandex, the Dutch flag was too confusing for Bing, both in its original and rotated forms.

[Image: e-flag-bing.jpg]

[Image: e-flag2-bing.jpg]

Google noted that there was a reflection in the canal of the image, but went no further than this, focusing on various paved paths in cities and nothing from Amsterdam.

[Image: e-results-google.jpg]

Google was close in the bird identification exercise, but just barely missed it — it is a grey, not great blue, heron.

[Image: e-bird-google.jpg]

Google was also unable to identify the Dutch flag. Though Yandex seemed to recognize that the image is a flag, Google’s algorithm focused on the windowsill framing the image and misidentified the flag as curtains.

[Image: e-flag-google.jpg]

[Image: e-flag2-google.jpg]

Final Scorecard: Yandex 9/14; Bing 2/14; Google 3.5/14

Creative Searching

Even with the shortcomings described in this guide, there are a handful of methods to maximize your search process and game the search algorithms.

 

Specialized Sites

For one, you could use some other, more specialized search engines outside of the three detailed in this guide. The Cornell Lab’s Merlin Bird ID app, for example, is extremely accurate in identifying the type of birds in a photograph, or giving possible options. Additionally, though it isn’t an app and doesn’t let you reverse search a photograph, FlagID.org will let you manually enter information about a flag to figure out where it comes from. For example, with the Dutch flag that even Yandex struggled with, FlagID has no problem. After choosing a horizontal tricolor flag, we put in the colors visible in the image, then receive a series of options that include the Netherlands (along with other, similar-looking flags, such as the flag of Luxembourg).

[Image: flagsearch1.jpg]

[Image: flagsearch2.jpg]

Language Recognition

If you are looking at a foreign language with an orthography you don’t recognize, try using OCR or Google Translate to make your life easier. You can use Google Translate’s handwriting tool to detect the language* of a word that you trace out by hand, or choose a language (if you already know it) and then write out the word yourself. Below, the name of a cafe (“Hedgehog in the Fog“) is written out with Google Translate’s handwriting tool, giving the typed-out version of the word (Ёжик) that can be searched.

*Be warned that Google Translate is not very good at recognizing letters if you do not already know the language, though if you scroll through enough results, you can find your handwritten letter eventually.

[Image: yozhikvtumane.jpg]

[Image: yozhik.jpg]

[Image: yozhik2.jpg]
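If the text is printed rather than handwritten, OCR may be faster than tracing letters by hand. A sketch with pytesseract, which requires the Tesseract binary plus the relevant language pack; the file name is a placeholder:

    from PIL import Image
    import pytesseract

    sign = Image.open("cafe_sign.jpg")
    # lang="rus" assumes the Russian language pack is installed.
    text = pytesseract.image_to_string(sign, lang="rus")
    print(text)  # e.g. "Ёжик в тумане", ready to paste into a search engine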

Pixelation And Blurring

As detailed in a brief Twitter thread, you can pixelate or blur elements of a photograph in order to trick the search engine to focus squarely on the background. In this photograph of Rudy Giuliani’s spokeswoman, uploading the exact image will not bring back results showing where it was taken.

[Image: 2019-12-16_14-55-50.jpg]

However, if we blur out/pixelate the woman in the middle of the image, it will allow Yandex (and other search engines) to work their magic in matching up all of the other elements of the image: the chairs, paintings, chandelier, rug and wall patterns, and so on.

[Image: blurtest.jpg]
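One way to produce that pixelation yourself, sketched with Pillow; the box coordinates and file names are placeholders:

    from PIL import Image

    img = Image.open("hotel_photo.jpg")
    box = (350, 120, 620, 700)  # region covering the person to hide
    region = img.crop(box)

    # Downscale the region drastically, then blow it back up: classic pixelation.
    pixelated = (region
                 .resize((16, 16), Image.Resampling.NEAREST)
                 .resize(region.size, Image.Resampling.NEAREST))
    img.paste(pixelated, box)
    img.save("hotel_photo_pixelated.jpg")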

After this pixelation is carried out, Yandex knows exactly where the image was taken: a popular hotel in Vienna.

[Image: yandexresult.jpg]

[Image: 2019-12-16_15-02-32.jpg]

Conclusion

Reverse image search engines have progressed dramatically over the past decade, with no end in sight. Along with the ever-growing amount of indexed material, a number of search giants have enticed their users to sign up for image hosting services, such as Google Photos, giving these search algorithms an endless amount of material for machine learning. On top of this, facial recognition AI is entering the consumer space with products like FindClone and may already be used in some search algorithms, namely with Yandex. There are no publicly available facial recognition programs that use any Western social network, such as Facebook or Instagram, but perhaps it is only a matter of time until something like this emerges, dealing a major blow to online privacy while also (at that great cost) increasing digital research functionality.

If you skipped most of the article and are just looking for the bottom line, here are some easy-to-digest tips for reverse image searching:

  • Use Yandex first, second, and third, and then try Bing and Google if you still can’t find your desired result.
  • If you are working with source imagery that is not from a Western or former Soviet country, then you may not have much luck. These search engines are hyper-focused on these areas, and struggle for photographs taken in South America, Central America/Caribbean, Africa, and much of Asia.
  • Increase the resolution of your source image, even if it just means doubling or tripling the resolution until it’s a pixelated mess. None of these search engines can do much with an image that is under 200×200.
  • Try cropping out elements of the image, or pixelating them if it trips up your results. Most of these search engines will focus on people and their faces like a heat-seeking missile, so pixelate them to focus on the background elements.
  • If all else fails, get really creative: mirror your image horizontally, add some color filters, or use the clone tool on your image editor to fill in elements on your image that are disrupting searches.
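For that last tip, here is a Pillow sketch that generates mirrored and color-shifted variants to re-run through a search engine (file names are placeholders; the clone-tool step needs a real image editor):

    from PIL import Image, ImageEnhance, ImageOps

    img = Image.open("stubborn_image.jpg")
    variants = {
        "mirrored": ImageOps.mirror(img),                # horizontal flip
        "warmer": ImageEnhance.Color(img).enhance(1.6),  # boosted saturation
        "grayscale": ImageOps.grayscale(img),
    }
    for name, variant in variants.items():
        variant.save(f"variant_{name}.jpg")  # try each in the search engine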

[Source: This article was published in bellingcat.com By Aric Toler - Uploaded by the Association Member: Issac Avila] 

Categorized in Investigative Research

Annotation of a doctored image shared by Rep. Paul A. Gosar on Twitter. (Original 2011 photo of President Barack Obama with then-Indian Prime Minister Manmohan Singh by Charles Dharapak/AP)

To a trained eye, the photo shared by Rep. Paul A. Gosar (R-Ariz.) on Monday was obviously fake.

[Image: Rep. Paul Gosar’s tweet with the doctored photo]

At a glance, nothing necessarily seems amiss. It appears to be one of a thousand (a million?) photos of a president shaking a foreign leader’s hand in front of a phalanx of flags. It’s easy to imagine that, at some point, former president Barack Obama encountered this particular official and posed for a photo.

Except that the photo at issue is of Iranian President Hassan Rouhani, someone Obama never met. Had he done so, it would have been significant news, nearly as significant as President Trump’s various meetings with North Korean leader Kim Jong Un. Casual observers would be forgiven for not knowing all of this, much less who the person standing next to Obama happened to be. Most Americans couldn’t identify the current prime minister of India in a New York Times survey; the odds they would recognize the president of Iran seem low.

Again, though, there are obvious problems with the photo that should jump out quickly. There’s that odd, smeared star on the left-most American flag (identified as A in the graphic above). There’s Rouhani’s oddly short forearm (B). And then that big blotch of color between the two presidents (C), a weird pinkish-brown blob of unexpected uniformity.

Each of those glitches reflects where the original image — a 2011 photo of Obama with then-Indian Prime Minister Manmohan Singh — was modified. The truncated star was obscured by Singh’s turban. The blotch of color is an attempt to remove the circle from the middle of the Indian flag behind the leaders. The weird forearm is a function of the slightly different postures and sizes of the Indian and Iranian leaders.


President Barack Obama meets with Indian Prime Minister Manmohan Singh in Nusa Dua, on the island of Bali, Indonesia, on Nov. 18, 2011. (Charles Dharapak/AP)

Compared with the original, the difference is obvious. What it takes, of course, is looking.

Tools exist to determine whether a photo has been altered. It’s often more art than science, involving a range of probability more than a certain final answer. The University of California at Berkeley professor Hany Farid has written a book about detecting fake images and shared quick tips with The Washington Post.

 

  • Reverse image search. Save the photo to your computer and then drop it into Google Image Search. You’ll quickly see where it might have appeared before, useful if an image purports to be of a breaking news event. Or it might show sites that have debunked it.
  • Check fact-checking sites. This can be a useful tool by itself. Images of political significance have a habit of floating around for a while, deployed for various purposes. The fake Obama-Rouhani image, for example, has been around since at least 2015 — when it appeared in a video created by a political action committee supporting Sen. Ron Johnson (R-Wis.).
  • Know what’s hard to fake. In an article for Fast Company, Farid noted that some things, like complicated physical interactions, are harder to fake than photos of people standing side by side. Backgrounds are also often tricky; it’s hard to remove something from an image while accurately re-creating what the scene behind them would have looked like. (It’s not a coincidence that both the physical interaction and background of the “Rouhani” photo were clues that it was fake.)

But, again, you have to care that you’re passing along a fake photo. Gosar didn’t. Presented with the image’s inaccuracy by a reporter from the Intercept, Gosar replied via tweet that “no one said this wasn’t photoshopped.”

“No one said the president of Iran was dead. No one said Obama met with Rouhani in person,” Gosar wrote to the “dim-witted reporter.” “The point remains to all but the dimmest: Obama coddled, appeased, nurtured and protected the worlds No. 1 sponsor of terror.”

As an argument, that may be evaluated on the merits. It is clearly the case, though, that Gosar had no qualms about sharing an edited image. He recognizes, in fact, that the photo is a lure for the point he wanted to make: Obama is bad.

That brings us to a more important point, one that demands a large-type introduction.

The Big Problem with social media

There exists a concept in social psychology called the “Dunning-Kruger effect.” You’ve probably heard of it; it’s a remarkable lens through which to consider a lot of what happens in American culture, including, specifically, politics and social media.

The idea is this: People who don’t know much about a subject necessarily don’t know how little they know. How could they? So after learning a little bit about the topic, there’s sudden confidence that arises. Now knowing more than nothing and not knowing how little of the subject they know, people can feel as though they have some expertise. And then they offer it, even while dismissing actual experts.

“Their deficits leave them with a double burden,” David Dunning wrote in 2011 about the effect, named in part after his research. “Not only does their incomplete and misguided knowledge lead them to make mistakes, but those exact same deficits also prevent them from recognizing when they are making mistakes and other people choosing more wisely.”

The effect is often depicted in a graph like this. You learn a bit and feel more confident talking about it — and that increases and increases until, in a flash, you realize that there’s a lot more to it than you thought. Call it the “oh, wait” moment. Confidence plunges, slowly rebuilding as you learn more, and learn more about what you don’t know. This affects all of us, myself included.

[Image: the Dunning-Kruger confidence curve (Philip Bump/The Washington Post)]

Dunning’s effect is apparent on Twitter all the time. Here’s an example from this week, in which the “oh, wait” moment comes at the hands of an actual expert.

[Image: example tweet thread]

One value proposition for social media (and the Internet more broadly) is that this sort of Marshall-McLuhan-in-“Annie-Hall” moment can happen. People can inform themselves about reality, challenge themselves by accessing the vast scope of human knowledge and even be confronted directly by those in positions of expertise.

In reality, though, the effect of social media is often to create a chorus of people who are at a similar, overconfident point in the Dunning-Kruger curve. Another value of the Internet is in its ability to create ad hoc like-minded communities, but that also means it can convene like-minded groups of wrong-minded opinions. It’s awfully hard to feel chastened or uninformed when there is any number of other people who vocally share your view. (Why, one could fill hours on a major cable-news network simply by filling panels with people on the dashed-line part of the graph above!)

The Internet facilitates ignorance as readily as it does knowledge. It allows us to build reinforcements around our errors. It allows us to share a fake image and wave away concerns because the target of the image is a shared enemy for your in-group. Or, simply, to accept a faked image as real because you’re either unaware of obvious signs of fakery or unaware of the unlikely geopolitics that surrounds its implications.

I asked Farid, the fake-photo expert, how normal people lingering at the edge of an “oh, wait” moment might avoid sharing altered images.

“Slow down!” he replied. “Understand that most fake news/images/videos are designed to be sensational or outrageous and get you to respond quickly before you’ve had time to think. When you find yourself reacting viscerally, take a breath, slow down, and don’t be so quick to share/like/retweet.”

Unless, of course, your goals are both to be sensational and to get retweets. In that case, go ahead and share the image. You can always rationalize it later.

[Source: This article was published in washingtonpost.com By Philip Bump - Uploaded by the Association Member: Alex Gray]

Categorized in Investigative Research

Since the Arab uprisings of 2011, the UAE has utilised 'cyber-security governance' to quell the harbingers of revolt and suppress dissident voices

The nuts and bolts of the Emirati surveillance state moved into the spotlight on 1 February as the Abu Dhabi-based cybersecurity company DarkMatter allegedly stepped "out of the shadows" to speak to the international media.

Its CEO and founder, Faisal al-Bannai, gave a rare interview to the Associated Press at the company's headquarters in Abu Dhabi, in which he absolved his company of any direct responsibility for human rights violations in the UAE.  

 

Established in the UAE in 2015, DarkMatter has always maintained itself to be a commercially driven company. Despite the Emirati government constituting 80 percent of DarkMatter's customer base and the company previously describing itself as "a strategic partner of the UAE government", its CEO was at pains to suggest that it was independent from the state.

According to its website, the company's stated aim is to "protect governments and enterprises from the ever-evolving threat of cyber attack" by offering a range of non-offensive cybersecurity services. 

Seeking skilled hackers

Though DarkMatter defines its activities as defensive, an Italian security expert, who attended an interview with the company in 2016, likened its operations to "big brother on steroids" and suggested it was deeply rooted within the Emirati intelligence system.

Simone Margaritelli, also a former hacker, alleged that during the interview he was informed of the UAE's intention to develop a surveillance system that was "capable of intercepting, modifying, and diverting (as well as occasionally obscuring) traffic on IP, 2G, 3G, and 4G networks".

Although he was offered a lucrative monthly tax-free salary of $15,000, he rejected the offer on ethical grounds.

Furthermore, in an investigation carried out by The Intercept in 2016, sources with inside knowledge of the company said that DarkMatter was "aggressively" seeking skilled hackers to carry out offensive surveillance operations. This included plans to exploit hardware probes already installed across major cities in order to track, locate and hack any person at any time in the UAE.


As with other states, there is a need for cybersecurity in the UAE. As the threat of cyber-attacks has increased worldwide, there have been numerous reports of attempted attacks from external actors on critical infrastructure in the country. 

Since the Arab uprisings of 2011, however, internal "cyber-security governance", which has been utilised to quell the harbingers of revolt and suppress dissident voices, has become increasingly important to the Emirati government and other regimes across the region.

Authoritarian control

In the UAE, as with other GCC states, this has found legislative expression in the cybercrime law. Instituted in 2012, its vaguely worded provisions essentially provide a legal basis to detain anybody who criticises the regime online.

This was to be followed shortly after by the formation of the UAE’s own cybersecurity entity, the National Electronic Security Authority (NESA), which recently began working in parallel with the UAE Armed Forces’ cyber command unit, established in 2014.  

A network of Emirati government agencies and state-directed telecommunications industries have worked in loose coordination with international arms manufacturers and cybersecurity companies to transform communications technologies into central components of authoritarian control. 

In 2016, an official from the Dubai police force announced that authorities were monitoring users across 42 social media platforms, while a spokesperson for the UAE’s Telecommunication Regulatory Authority similarly boasted that all social media profiles and internet sites were being tracked by the relevant agencies.

Crown Prince Mohammed Bin Zayed Al Nahyan of Abu Dhabi meets with US President Donald Trump in Washington in May 2017 (AFP)

As a result, scores of people who have criticised the UAE government on social media have been arbitrarily detained, forcefully disappeared and, in many cases, tortured.

Last year, Jordanian journalist Tayseer al-Najjar and prominent Emirati academic Nasser bin Ghaith received sentences of three and 10 years respectively for comments made on social media. Similarly, award-winning human rights activist Ahmed Mansoor has been arbitrarily detained for nearly a year due to his online activities. 

This has been a common theme across the region in the post-"Arab Spring" landscape. In line with this, a lucrative cybersecurity market opened up across the Middle East and North Africa, which, according to the US tech research firm Gartner, was valued at $1.3bn in 2016.

A modern-day surveillance state

In many respects, the UAE's surveillance infrastructure has been built by a network of international cybersecurity "dealers" who have willingly profited from supplying the Emirati regime with the tools needed for a modern-day surveillance state. 

Moreover, it has been reported that DarkMatter has been hiring a range of top talent from across the US national security and tech establishment, including from Google, Samsung, and McAfee. Late last year, it was revealed that DarkMatter was managing an intelligence contract that had been recruiting former CIA agents and US government officials to train Emirati security officials in a bid to bolster the UAE's intelligence body.

UK military companies also have a foothold in the Emirati surveillance state. Last year, it was revealed that BAE Systems had been using a Danish subsidiary, ETI Evident, to export surveillance technologies to the UAE government and other regimes across the region. 

'The million-dollar dissident'

Although the UAE and Israel officially have no diplomatic relations, in 2016 Abu Dhabi launched Falcon Eye, an Israeli-installed civil surveillance system. It enables Emirati security officials to monitor every person "from the moment they leave their doorstep to the moment they return to it", a source close to Falcon Eye told Middle East Eye in 2015.

The source added that the system allows work, social and behavioral patterns to be recorded, analyzed and archived: "It sounds like sci-fi but it is happening in Abu Dhabi today."

Moreover, in a story that made headlines in 2016, Ahmed Mansoor's iPhone was hacked by the UAE government with software provided by the Israeli-based security company NSO Group. Emirati authorities reportedly paid $1m for the software, leading international media outlets to dub Mansoor "the million-dollar dissident."

Mansoor's case illustrates how Emirati authorities have operated in the past. In recent years, the UAE has bought tailored software products from international companies such as Hacking Team to carry out isolated, targeted attacks on human rights activists like Mansoor.

The operations of DarkMatter, as well as the installation of Falcon Eye, suggest, however, that rather than relying on individual products from abroad, Emirati authorities are now building a surveillance system of their own and bringing operations in-house by developing the infrastructure for a 21st-century police state. 

[Source: This article was published in middleeasteye.net By JOE ODELL - Uploaded by the Association Member: Wushe Zhiyang]

Categorized in Deep Web

[Source: This article was published in halifaxtoday.ca By Ian Milligan - Uploaded by the Association Member: Deborah Tannen]

Today, and into the future, consulting archival documents increasingly means reading them on a screen

Our society’s historical record is undergoing a dramatic transformation.

Think of all the information that you create today that will be part of the record for tomorrow. More than half of the world's population is online, and many of them are doing at least some of the following: communicating by email, sharing thoughts on Twitter or other social media, or publishing on the web.

Governments and institutions are no different. The American National Archives and Records Administration, responsible for official American records, "will no longer take records in paper form after December 31, 2022."

In Canada, under Library and Archives Canada’s Digital by 2017 plan, records are now preserved in the format that they were created in: that means a Word document or email will be part of our historical record as a digital object.

Traditionally, exploring archives meant largely physically collecting, searching and reviewing paper records. Today, and into the future, consulting archival documents increasingly means reading them on a screen.

This brings with it an opportunity — imagine being able to search for keywords across millions of documents, leading to radically faster search times — but also a challenge, as the number of electronic documents increases exponentially.

As I’ve argued in my recent book History in the Age of Abundance, digitized sources present extraordinary opportunities as well as daunting challenges for historians. Universities will need to incorporate new approaches to how they train historians, either through historical programs or newly-emerging interdisciplinary programs in the digital humanities.

The ever-growing scale and scope of digital records pose technical challenges: historians need new skills to plumb these sources for meaning, trends, voices and other currents, and to piece together an understanding of what happened in the past.

There are also ethical challenges, which, although not new in the field of history, now bear particular contemporary attention and scrutiny.

Historians have long relied on librarians and archivists to bring order to information. Part of their work has involved ethical choices about what to preserve, curate, catalogue and display and how to do so. Today, many digital sources are now at our fingertips — albeit in raw, often uncatalogued, format. Historians are entering uncharted territory.

Digital abundance

Traditionally, as the late, great American historian Roy Rosenzweig of George Mason University argued, historians operated in a scarcity-based economy: we wished we had more information about the past. Today, the hundreds of billions of web pages preserved at the Internet Archive alone amount to more archival information than scholars have ever had access to. People who would never before have been included in archives are part of these collections.

Take web archiving, for example, which is the preservation of websites for future use. Since 2005, Library and Archives Canada’s web archiving program has collected over 36 terabytes of information with over 800 million items.

Even historians who study the Middle Ages or the 19th century are being affected by this dramatic transformation. They now frequently consult records that began life as traditional parchment or paper but were subsequently digitized.

Historians’ digital literacy

Our research team at the University of Waterloo and York University, collaborating on the Archives Unleashed Project, uses sources like the GeoCities.com web archive. This is a collection of websites published by users between 1994 and 2009. We have some 186 million web pages to use, created by seven million users.

Our traditional approaches to examining historical sources simply won't work at the scale of hundreds of millions of documents created by a single website. We can't read page by page, nor can we simply count keywords or outsource our intellectual labor to a search engine like Google.

As historians examining these archives, we need a fundamental understanding of how records were produced, preserved and accessed. Such questions and modes of analysis are continuous with historians’ traditional training: Why were these records created? Who created or preserved them? And, what wasn’t preserved?

Second, historians who confront such voluminous data need to develop more contemporary skills to process it. These skills range from knowing how to take images of documents and make them searchable using Optical Character Recognition (OCR), to being able not only to count how often given terms appear, but also to see the contexts in which they appear and how concepts begin to appear alongside other concepts.
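
To make this concrete, here is a minimal Python sketch, for illustration only. It assumes the open-source pytesseract and Pillow packages are installed (along with the Tesseract OCR engine itself), and the file name is a hypothetical placeholder:

    # Minimal sketch: OCR a scanned page, then show a search term in context.
    # Assumes pytesseract + Pillow are installed and Tesseract is on the system;
    # "scan.png" stands in for a digitized archival page.
    from PIL import Image
    import pytesseract

    def ocr_page(path):
        """Extract plain text from a scanned page image."""
        return pytesseract.image_to_string(Image.open(path))

    def keyword_in_context(text, term, window=5):
        """Yield each occurrence of `term` with `window` words on either side."""
        words = text.split()
        for i, word in enumerate(words):
            if word.strip('.,;:"\'').lower() == term.lower():
                yield " ".join(words[max(0, i - window):i + window + 1])

    for snippet in keyword_in_context(ocr_page("scan.png"), "johnson"):
        print(snippet)

The specific libraries matter less than the workflow: once a scanned page is machine-readable, questions about context become questions that can be asked across millions of pages.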

You might be interested in finding the “Johnson” in “Boris Johnson,” but not the “Johnson & Johnson Company.” Just searching for “Johnson” is going to get a lot of misleading results: keyword searching won’t get you there. Yet emergent research in the field of natural language processing might!
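
Named-entity recognition is one such technique. As a sketch of the idea only, using the open-source spaCy library (and assuming its small English model has been downloaded), with an invented example sentence:

    # Minimal sketch: named-entity recognition separates a PERSON "Johnson"
    # from an ORG "Johnson & Johnson", which plain keyword search conflates.
    # Assumes: pip install spacy && python -m spacy download en_core_web_sm
    import spacy

    nlp = spacy.load("en_core_web_sm")
    text = ("Boris Johnson spoke to reporters while shares of "
            "Johnson & Johnson fell three percent.")

    for ent in nlp(text).ents:
        print(ent.text, ent.label_)

    # Typical output (model predictions can vary):
    #   Boris Johnson      PERSON
    #   Johnson & Johnson  ORG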

Historians need to develop basic algorithmic and data fluency. They don't need to be programmers, but they do need to think about how code and data operate, how digital objects are stored and created, and what role humans play at every stage.

Deepfakes vs. history

As historical work is increasingly defined by digital records, historians can contribute to critical conversations around the role of algorithms and truth in the digital age. While both tech companies and some scholars have advanced the idea that technology and the internet will strengthen democratic participation, historical research can help uncover the impact of socio-economic power throughout communications and media history. Historians can also help amateurs parse the sea of historical information and sources now on the Web.

One of the defining skills of a historian is an understanding of historical context. Historians instinctively read documents, whether they are newspaper columns, government reports or tweets, and contextualise them in terms of not only who wrote them, but their environment, culture and time period.

As societies lose their physical paper trails and increasingly rely on digital information, historians, and their grasp of context, will become more important than ever.

As deepfakes — products of artificial intelligence that can alter images or video clips — increase in popularity online, both our media environment and our historical record will increasingly be full of misinformation.

Western societies’ traditional archives — such as those held by Library and Archives Canada or the National Archives and Records Administration — contain (and have always contained) misinformation, misrepresentation and biased worldviews, among other flaws.

Historians are specialists in critically reading documents and then seeking to confirm them. They synthesise their findings with a broad array of additional sources and voices. Historians tie together big pictures and findings, which helps us understand today’s world.

The work of a historian might look a lot different in the 21st century — exploring databases, parsing data — but the application of their fundamental skills of seeking context and accumulating knowledge will serve both society and them well in the digital age.

Categorized in Investigative Research

[Source: This article was published in techcrunch.com By Catherine Shu - Uploaded by the Association Member: Clara Johnson]

When Facebook Graph Search launched six years ago, it was meant to help users discover content across public posts on the platform. Since then, the feature stayed relatively low-profile for many users (its last major announcement was in 2014, when a mobile version was rolled out) but became a valuable tool for many online investigators, who used it to collect evidence of human rights abuses, war crimes and human trafficking. Last week, however, many of them discovered that Graph Search features had suddenly been turned off, reports Vice.

Graph Search let users search in plain language (i.e. sentences written the way people talk, not just keywords), but more importantly, it also let them filter search results by very specific criteria. For example, users could find who had liked a page or photo, when someone had visited a city or if they had been in the same place at the same time with another person. Despite the obvious potential for privacy issues, Graph Search was also an important resource for organizations like Bellingcat, an investigative journalism website that used it to document Saudi-led airstrikes in Yemen for its Yemen Project.

Other investigators also used Graph Search to build tools like StalkScan, but the removal of Graph Search means they have had to suspend their services or offer them in a very limited capacity. For example, StalkScan’s website now has a notice that says:

“As of June 6th, you can scan only your own profile with this tool. After two years and 28M+ StalkScan sessions, Facebook decided to make the Graph Search less transparent. As usual, they did this without any communication or dialogue with activists and journalists that used it for legitimate purposes. The creepy graph search itself still exists, but is now less accessible and more difficult to use. Make sure to check yourself with this tool, since your data is still out there!”

Facebook may be trying to take a more cautious stance because it is still dealing with the fallout from several major security lapses, including the Cambridge Analytica data scandal, as well as the revelation earlier this year that it had stored hundreds of millions of passwords in plain text.

In a statement to Vice, a Facebook spokesperson said: “The vast majority of people on Facebook search using keywords, a factor which led us to pause some aspects of graph search and focus more on improving keyword search. We are working closely with researchers to make sure they have the tools they need to use our platform.” But one of Vice's sources, a current employee at Facebook, said within the company there is “lots of internal and external struggle between giving access to info so people can find friends or research things (like Bellingcat), and protecting it.”

TechCrunch has contacted Facebook for more information.

Categorized in Internet Search

[Source: This article was published in globalnews.ca By Jessica Vomiero - Uploaded by the Association Member: Anna K. Sasaki]

Amid the frenzy of a cross-country RCMP manhunt for two young men who've been charged in one murder and are suspects in another double homicide, a photo of an individual who looked like one of the suspects began circulating online.

Bryer Schmegelsky, 18, and Kam McLeod, 19, have been charged with the second-degree murder of Leonard Dyck and are suspects in the double homicide of Lucas Fowler and Chyna Deese. The two men are currently on the run and police have issued nationwide warrants for their arrest.

The search has focused on northern Manitoba, where the men were believed to have been sighted on Monday. The photo was sent to police on Thursday evening by civilians, following an RCMP request that anyone with information about the whereabouts of the suspects report it to police.

It depicts a young man who strikingly resembles the photos police released of McLeod, holding up a copy of the Winnipeg Sun featuring the two suspects on the front page.

RCMP say the man in the photo is not the suspect Kam McLeod. Experts say police always have to follow up on online rumours and pictures like this.

Police eventually determined that the photo did not depict either of the suspects.

“It appears to be an instance where a photo was taken and then ended up unintentionally circulated on social media,” RCMP Cpl. Julie Courchaine said at a press conference on Friday.

She also warned against sharing or creating rumours online.

“The spreading of false information in communities across Manitoba has created fear and panic,” she said.

While this particular photo did not show a suspect, the RCMP confirmed to Global News that their investigators follow up on “any and all tips” to determine their validity. Experts note that this mandate may force the RCMP to pull resources away from the primary investigation.

“They have to assign investigators to take a look at the information and then to follow up,” explained Kim Watt-Senner, who served as an RCMP officer for almost 20 years and is now a Fraser Lake, B.C., city councillor. “They physically have to send members out to try and either debunk or to corroborate that yes, this is, in fact, a bona fide lead.”

After seeing the photo, she noted that a trained eye would be able to see some distinct differences in the eyes and the facial structure, but “if a person wasn’t trained to look for certain things, I can see why the general public would think that was the suspect.”

She added that while she believes getting public input through digital channels is largely a good thing, it can also be negative.

“There’s a whole wave that happens after the information is shared on social media and the sharing of the posts and everything else, then it goes viral and it can go viral before the RCMP or the police have a chance to authenticate that information.”

While she knows through her experience as a Mountie that people are trying to help, “it can also impede the investigation, too.”

Near the beginning of the investigation, the RCMP appealed to the public for any information they had about Schmegelsky and McLeod, or the victims. Kenneth Gray, a retired FBI agent and lecturer at the University of New Haven, explained that the internet has also changed the way police respond when receiving public tips.

“Whenever you asked the public for assistance on a case and you start receiving tips, every one of those tips has to be examined to determine whether or not it is useful to solve whatever case you’re working on and that takes time,” said Gray.

“In this particular case with the photograph, it had to be examined to determine whether this was actually the suspect or whether it was just a lookalike that took vital resources that could have been devoted to actually finding this guy.”

He explained that if he’d gone about verifying or debunking the photo himself, he’d attempt to determine where the information came from and trace that back to the person in the photo. He suggested performing an electronic search on the image to ultimately determine who is in the photograph.
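
Gray's "electronic search on the image" is, in essence, a reverse image search. As an illustration of one building block of that idea, and not a description of any police tool, here is a minimal Python sketch using the open-source imagehash library; the file names are hypothetical:

    # Minimal sketch: test whether a circulating tip photo is a near-duplicate
    # of a released reference photo, via perceptual hashing.
    # Assumes the imagehash and Pillow packages are installed.
    from PIL import Image
    import imagehash

    reference = imagehash.phash(Image.open("released_photo.jpg"))
    tip = imagehash.phash(Image.open("circulating_photo.jpg"))

    # Subtracting two hashes gives a Hamming distance: small values suggest
    # the same underlying image (a re-crop or re-compression), while visually
    # distinct photos tend to score much higher.
    distance = reference - tip
    print(distance, "likely the same image" if distance <= 8 else "likely different images")

A technique like this can only flag re-used images; it says nothing about whether two different photographs show the same person, which is where facial recognition, discussed below, comes in.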

In addition, the internet has added a new layer of complexity to screening public leads. With the advent of social media, “you get inundated with information that is coming from all over the place.”

“At one point, you would put out local information and you’d only get back local-type tips. But now, with the advent of the internet, tips can come in from all over the world. It casts such a large net that you get information from everywhere,” he said.

“That gives you a lot more noise.”

The model that most departments have pursued to deal with this, he said, is one that requires investigators to pursue all leads while setting priorities to determine which ones should be given the most resources.

While the widened reach that the internet affords can complicate things, some experts suggest that this isn’t always a negative thing.

Paul McKenna, the former director of the Ontario Provincial Police Academy and a former policing consultant for the Nova Scotia Department of Justice, agrees.

“All leads are potentially useful for the police until they are proven otherwise,” he said in a statement. “Every lead may hold something of value and police always remind the public that even the most apparently inconsequential thing may turn out to have relevance.”

Social media has played a role in a number of high-profile arrests over the years, including that of Brock Turner, who in 2016 was convicted of four counts of felony sexual assault and allegedly took photos of the naked victim and posted them on social media, and Melvin Colon, a gang member who was arrested in New York after police were given access to his online posts.

In this particular case, Watt-Senner explained that a command centre would likely be set up close to where the RCMP are stationed in Manitoba. She said that all information will be choreographed out of that command centre, where a commander will decipher the leads that come through.

“Those commanders would be tasked and trained on how to obtain information, filter information, and disseminate information and to choreograph the investigative avenues of that information in real-time,” Watt-Senner said.

She notes that the RCMP would likely have used facial recognition software to determine for certain whether the man depicted in the photo was, in fact, the suspect, and the software would also be used as citizens began to report sightings of the suspects.

“This is really integral. This is a really important part of the investigation, especially when you start to get sightings from different areas. That information would be sent to people that are specifically trained in facial recognition.”
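
The article does not say which software the RCMP uses. Purely as an illustration of the underlying idea, here is a minimal Python sketch using the open-source face_recognition library; the file names are hypothetical:

    # Minimal sketch: check whether a face in a reported-sighting photo
    # matches a reference photo of a suspect. Assumes the face_recognition
    # package (built on dlib) is installed.
    import face_recognition

    reference_image = face_recognition.load_image_file("suspect_reference.jpg")
    sighting_image = face_recognition.load_image_file("reported_sighting.jpg")

    # face_encodings returns one 128-number vector per detected face;
    # this assumes at least one face is found in the reference photo.
    reference_encoding = face_recognition.face_encodings(reference_image)[0]
    sighting_encodings = face_recognition.face_encodings(sighting_image)

    # compare_faces returns one boolean per face in the sighting photo;
    # the tolerance setting trades false matches against missed matches.
    matches = face_recognition.compare_faces(
        sighting_encodings, reference_encoding, tolerance=0.6)
    print("possible match" if any(matches) else "no match found")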

While the investigation may take longer because of the higher volume of leads being received through digital channels, all three experts conclude that the good that comes from social media outweighs the bad.

Categorized in Investigative Research