A data miner at the helm of Rakuten Institute of Technology, Tokyo
Rakuten is more than just a collection of services: It’s a collection of communities. One of those communities is the Rakuten Institute of Technology (RIT), founded by Masaya Mori in 2006. Thirteen years later, Yu Hirate is taking the helm of RIT’s Tokyo branch. Below is a conversation between RIT’s founder and the newly appointed head scientist.
Yu Hirate: Insights from analyzing data through the dimension of time
Mori: I established RIT in 2006, and you joined in 2009. You’ve really been around since the beginning stages!
When I first met you, I thought you were a very insightful man. You were a data mining specialist, analyzing data by observing it through the dimension of time. When you told me about your time-series analysis using graph data structures, I was impressed. Was this something you developed as a student?
Hirate: I can’t believe it’s been 10 years!
Yes, as a student, I was researching uses for graph analytics. At the time, my lab was trying to understand the macro structure of the internet by crawling the web pages of the entire world, and how that structure evolved. The structure of the web could be visualized as graph data, so we were analyzing how that would change over time.
I was later given the opportunity to work with a major IT company on a project to automatically detect illegal activity between users. I realized that inter-user relationships can also be visualized in the same way, in a graph data structure. I could apply the same approach I was using for the web structure analysis.
Users acting illegally tend to do it in a somewhat unnatural way. From this, I speculated that we could find them by focusing on how the inter-user graph structure changes over time. Turns out, my speculation was spot on and I was able to detect users acting illegally.
Mori: What kind of fraud did you find with this mashup of graph data and timelines?
Hirate: For example, if you are attempting to sell products illegally on an online auction site, the first step would be to boost your account’s rating. That’s how you gain a user’s trust. To do that, you would cooperate with fellow fraudsters or use multiple accounts to make fake trades for each other’s products, all the while boosting each other’s ratings.
But this kind of activity creates a very specific structure, called a clique. If you can detect build-ups of cliques, you can detect many accounts that are trying to trade illegally.
This method can also be used for fraud detection on Rakuten services. Fake reviews, for example, all have the same structure. If a merchant were to use multiple accounts to review its own products and collect positive ratings, you could use a similar method to find it.
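As an illustration of the clique idea described above, here is a minimal Python sketch. It is not the method used at Rakuten or in Hirate's research; it simply assumes that trades (or mutual ratings) are available as pairs of account IDs and lists every maximal clique of a given minimum size using the standard Bron-Kerbosch algorithm with pivoting. The account names, the `min_size` threshold, and the function names are all hypothetical.

```python
from collections import defaultdict


def build_graph(trades):
    """Build an undirected graph: accounts are nodes, and an edge means
    the two accounts traded with (or rated) each other at least once."""
    adj = defaultdict(set)
    for a, b in trades:
        if a != b:  # ignore self-trades
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques (Bron-Kerbosch with pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed nodes
        if not p and not x:
            cliques.append(r)
            return
        # Pick the pivot with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def suspicious_groups(trades, min_size=4):
    """Return groups of accounts that all traded with one another.
    min_size is an assumed threshold, not a value from the interview."""
    adj = build_graph(trades)
    groups = [sorted(c) for c in maximal_cliques(adj) if len(c) >= min_size]
    return sorted(groups)


# Hypothetical data: u1..u4 all trade with each other; the rest do not.
example_trades = [
    ("u1", "u2"), ("u1", "u3"), ("u1", "u4"),
    ("u2", "u3"), ("u2", "u4"), ("u3", "u4"),
    ("u1", "u5"), ("u6", "u7"),
]
print(suspicious_groups(example_trades))  # [['u1', 'u2', 'u3', 'u4']]
```

A clique on its own is only a hint: close-knit groups of legitimate users form cliques too. The approach described in the interview also looks at how such structures build up over time, which this sketch leaves out.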
Mori: Analyzing inter-user activity is great for detecting fraud, but what else can you use it for?
Hirate: All sorts of things! Graph data structures are very versatile. Not just web pages or interpersonal connections, but even the relationship between two different products can be expressed in a graph data structure. Illegal activity in particular produces specific patterns, so this approach can be used to detect many different types of fraud.
Mori: This impressed me no end when you first told me about it. Even today, I still think it’s incredibly insightful. Changing your point of view and combining it with something else, that’s how innovation is born.
A real data shock; a decade to the top
Mori: What inspired you to join Rakuten?
Hirate: The notion of working with so much real-world data was a big draw. I was frustrated in university because you rarely get to work with actual data; it’s all theoretical. Joining the Institute and working with Rakuten’s real-world data was really exciting.
The first data that I worked with was Rakuten Ichiba’s search data. My task was to use the data to improve the product search engine. Since I’d never had the chance to work with such real data in university, I remember being very excited from day one!
Mori: I can feel your passion for data mining! Do you think this is a product of your love for society, for the world?
Hirate: Behind the data, there is always some phenomenon that creates it, be it physical, social, or otherwise. I find it fascinating to understand those phenomena by analyzing the data that comes out of them.
Fraud detection, for example, simply shines a light on fraud, a social action. However, by utilizing the data in a different way, you could use it for something positive, like improving Rakuten Ichiba’s recommendation system or the search engine autocomplete feature.
Mori: RIT is a bit of a society in and of itself, with all sorts of different people doing very different things. As the new head of the Tokyo branch, how are you going to manage it all? Will you find the time to do research yourself?
Hirate: Absolutely. There are so many researchers from so many different places, and everyone is driven by different philosophies. I like being in a position where I can speculate about how people are thinking. I think that’s part of management.
It is true that the time I spend actually researching has decreased, but at the same time, I get to see the results of more and more interesting projects from different researchers. I feel like I’m a part of all their research, and that’s so exciting for me.
Mori: Where do you want to take Tokyo RIT from here?
Hirate: Thanks to you and other researchers, RIT has established a strong reputation within Rakuten. Other departments now come to us with their technical problems. I want RIT to continue to serve as Rakuten’s technology hub.
At the same time, I want to establish RIT’s reputation as a research institute outside of Rakuten. I want belonging to RIT to hold a certain degree of status.
To do that, we need to produce more results geared towards the external research community to gain recognition. We want to encourage our researchers to write more papers and increase the number of publications, while opening the system up to guest researchers.
This interview is part two of a series on RIT researchers. Part one introduces the man behind Egison: Satoshi Egi.