Learning Effective Embeddings for Machine Generated Emails with Applications to Email Category Prediction
Authors: | Marc Najork, Yu Sun, Andrei Z. Broder, James B. Wendt, Lluis Garcia-Pueyo |
---|---|
Year of publication: | 2018 |
Subject: |
World Wide Web
Sequence; Computer science; 0202 electrical engineering, electronic engineering, information engineering; Key (cryptography); Embedding; 020201 artificial intelligence & image processing; Transient (computer programming); 02 engineering and technology; 010501 environmental sciences; Representation (mathematics); 01 natural sciences; 0105 earth and related environmental sciences |
Source: | IEEE BigData |
Description: | Machine generated business-to-consumer (B2C) emails such as receipts, newsletters, and promotions constitute a large portion of users’ inboxes today. These emails reflect the users’ interests and often are sequentially correlated, e.g., users interested in relocating may receive a sequence of messages on housing, moving, job availability, etc. We aim to infer (and eventually serve) the users’ future interests by predicting the categories of their future emails. There are many useful methods, such as recurrent neural networks, that can be applied for such predictions, but in all cases the key to better performance is an effective representation of emails and users. To this end, we propose a general framework for learning embeddings for emails and users, using as input only the sequence of B2C templates users receive and open. (A template is a B2C email stripped of all transient information related to specific users.) These learned embeddings allow us to identify both sequentially correlated emails and users with similar sequential interests. We can also use the learned embeddings either as input features or embedding initializers for email category prediction tasks. Extensive experiments with millions of fully anonymized B2C emails demonstrate that the learned embeddings can significantly improve the prediction accuracy for future email categories. We hope that this effective yet simple embedding learning framework will inspire new machine intelligence applications that will improve the users’ email experience. |
Database: | OpenAIRE |
External link: |
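
The description above says that embeddings for emails and users are learned only from the sequence of B2C templates each user receives and opens. This record does not describe the authors' model, so the following is only a rough illustration of the general idea, not their framework: it swaps in simple count-based co-occurrence vectors in place of learned embeddings. The template IDs, the `window` parameter, and all function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cooccurrence_embeddings(sequences, window=1):
    """Map each template ID to an L2-normalized vector of windowed co-occurrence counts.

    Assumption: a count-based stand-in for learned embeddings, for illustration only.
    """
    vocab = sorted({t for seq in sequences for t in seq})
    index = {t: i for i, t in enumerate(vocab)}
    counts = defaultdict(lambda: [0.0] * len(vocab))
    for seq in sequences:
        for i, t in enumerate(seq):
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[t][index[seq[j]]] += 1.0
    embeddings = {}
    for t in vocab:
        vec = counts[t]
        norm = sqrt(sum(x * x for x in vec))
        embeddings[t] = [x / norm for x in vec] if norm > 0 else list(vec)
    return embeddings


def user_embedding(sequence, embeddings):
    """Represent a user as the average of the vectors of the templates they opened."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    if not sequence:
        return total
    for t in sequence:
        for k, x in enumerate(embeddings[t]):
            total[k] += x
    return [x / len(sequence) for x in total]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


# Hypothetical per-user sequences of template IDs (toy data, not from the paper).
sequences = [
    ["housing", "moving", "jobs"],
    ["housing", "moving", "jobs"],
    ["receipt", "promo"],
    ["promo", "receipt"],
]
emb = cooccurrence_embeddings(sequences)
users = [user_embedding(seq, emb) for seq in sequences]
```

In this toy data, templates that appear in the same context ("housing" and "jobs" both neighbor "moving") end up with similar vectors, and users with similar sequences end up close to each other. That mirrors, in a much cruder form, the two uses the description mentions: finding sequentially correlated emails and finding users with similar sequential interests.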