Samia Touileb

Researcher, MediaFutures: Research Centre for Responsible Media Technology & Innovation
  • E-mail: Samia.Touileb@uib.no
  • Phone: +47 55 58 41 31
  • Visitor Address
    Fosswinckels gate 6
  • Postal Address
    Postboks 7802
    5020 Bergen

Samia Touileb is currently a researcher in MediaFutures WP5 on Norwegian Language Technologies. Prior to this she was a postdoc at the Language Technology Group (LTG), Department of Informatics, University of Oslo. She holds a PhD in Natural Language Processing (NLP) from the University of Bergen and has worked on NLP research and applications for almost a decade.

Her main research interests are information extraction, sentiment analysis, bias and fairness in NLP, and applications of NLP and machine learning methods to social science research. She works mainly on under-resourced languages such as Norwegian.

Academic article
  • (2016). ADIOS LDA: When Grammar Induction Meets Topic Modeling. NIKT: Norsk IKT-konferanse for forskning og utdanning.
  • (2014). Inducing Information Structures for Data-driven Text Analysis. Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings.
  • (2014). Applying grammar induction to text mining. Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings. 712-717.
  • (2016). Getting to know large newsflows: Automatically induced information structures as keyphrases for news content analysis.
  • (2012). Networks of texts and people.
Academic lecture
  • (2018). Operationalising Diversity for Big Data Policy Research.
  • (2017). Finding Voices in the Margins: Computer-Assisted Discovery of Naturally Belonging Names.
  • (2015). Computer supported deliberation and argumentation online. Proposing a system for online argumentation.
  • (2013). Inducing local grammars from n-grams.
Academic anthology/Conference proceedings
  • (2021). Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics.
Doctoral dissertation
  • (2017). Automatically Inducing Information Structures. A Text Mining Approach Based on the Distributional Hypothesis.
Academic chapter/article/Conference paper
  • (2021). Using Gender- and Polarity-Informed Models to Investigate Bias. 9 pages.
  • (2021). The interplay between language similarity and script on a novel multi-layer Algerian dialect corpus. 13 pages.
  • (2021). NorDial: A Preliminary Corpus of Written Norwegian Dialect Use. 7 pages.
  • (2020). Named Entity Recognition without Labelled Data: A Weak Supervision Approach. 16 pages.
  • (2020). LTG-ST at NADI Shared Task 1: Arabic Dialect Identification using a Stacking Classifier. 7 pages.
  • (2020). Identifying Sentiments in Algerian Code-switched User-generated Comments. 8 pages.
  • (2020). Gender and sentiment, critics and authors: a dataset of Norwegian book reviews. 14 pages.
  • (2019). Measuring Diachronic Evolution of Evaluative Adjectives with Word Embeddings: the Case for English, Norwegian, and Russian. 8 pages.
  • (2019). Lexicon information in neural sentiment analysis: a multi-task learning approach. 12 pages.
  • (2018). NoReC: The Norwegian Review Corpus. 6 pages.
  • (2018). Automatic identification of unknown names with specific roles. 9 pages.
  • (2014). Constructions: a new unit of analysis for corpus-based discourse analysis. 11 pages.
  • (2021). Using Gender- and Polarity-informed Models to Investigate Bias.
  • (2018). Automatically identifying names of unrecognized politicians.
  • (2015). A computational approach to organize and analyze online communication data.
  • (2013). Applying Corpus Techniques to Climate Change Blogs.

More information is available in the national current research information system (CRIStin).