- Email: samia.touileb@uib.no
- Visiting address: Fosswinckels gate 6, Lauritz Meltzers hus, 5007 Bergen, Room 516
- Postal address: Postboks 7802, 5020 Bergen
Samia Touileb is an associate professor in language technology (Natural Language Processing). Previously, she was a researcher at MediaFutures (WP5: Norwegian language technology) and a postdoctoral fellow in the Language Technology Group (LTG), Department of Informatics, University of Oslo. She holds a PhD in language technology from the University of Bergen.
Her main research interests include bias and fairness in language technology models, information extraction, automatic summarization, and applications of language technology and machine learning methods in social science research.
- (2023). Learning Horn envelopes via queries from language models. International Journal of Approximate Reasoning. 20 pages.
- (2016). ADIOS LDA: When Grammar Induction Meets Topic Modeling. NIKT: Norsk IKT-konferanse for forskning og utdanning.
- (2014). Inducing Information Structures for Data-driven Text Analysis. Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings.
- (2014). Applying grammar induction to text mining. Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings. 712-717.
- (2023). The Societal and Ethical Implications of Language Models.
- (2023). The Ethics of Large Language Models.
- (2023). Sosiale og etiske utfordringer med språkmodeller som ChatGPT.
- (2023). Når kunstig intelligens inntar redaksjonen.
- (2023). Demystifying ChatGPT and language models.
- (2023). ChatGPT: teknologien, datasettet, og det vi (ikke) vet.
- (2023). ChatGPT & AI in education.
- (2023). Big Science Gullgruve eller fallgruve?
- (2023). Benchmarking the societal and ethical implications of large language model.
- (2016). Getting to know large newsflows: Automatically induced information structures as keyphrases for news content analysis.
- (2012). Networks of texts and people.
- (2023). Store språkmodeller: muligheter og utfordringer.
- (2023). Sosiale og etiske utfordringer med språkmodeller.
- (2023). Hva er ChatGPT og hvordan fungerer det og lignende verktøy?
- (2023). Blir vi overflødige? En samtale om kunstig intelligens og utdanning.
- (2023). Large Language models: What are they, and what are their ethical implications?
- (2018). Operationalising Diversity for Big Data Policy Research.
- (2017). Finding Voices in the Margins: Computer-Assisted Discovery of Naturally Belonging Names.
- (2015). Computer supported deliberation and argumentation online. Proposing a system for online argumentation.
- (2013). Inducing local grammars from n-grams.
- (2023). Proceedings of the 5th Symposium of the Norwegian AI Society (NAIS 2023). NAIS Norwegian Artificial Intelligence Society.
- (2021). Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics.
- (2023). KI-dyret må mates med varsomhet. M24.
- (2023). Chat GPT egner seg dårlig til eksamenssensuren. Morgenbladet.
- (2017). Automatically Inducing Information Structures. A Text Mining Approach Based on the Distributional Hypothesis.
- (2023). Kunstig intelligens: Krever åpenhet og integritet.
- (2023). NorBench – A Benchmark for Norwegian Language Models. 16 pages.
- (2023). Measuring normative and descriptive biases in language models using census data.
- (2023). Making sense of nonsense: Integrated gradient-based input reduction to improve recall for check-worthy claim detection. 13 pages.
- (2023). JSEEGraph: Joint Structured Event Extraction as Graph Parsing.
- (2023). Identifying Token-Level Dialectal Features in Social Media. 13 pages.
- (2023). Automated Claim Detection for Fact-checking: A Case Study using Norwegian Pre-trained Language Models.
- (2023). Arabic dialect identification: An in-depth error analysis on the MADAR parallel corpus. 15 pages.
- (2022). Occupational Biases in Norwegian and Multilingual Language Models. 12 pages.
- (2022). NorDiaChange: Diachronic Semantic Change Dataset for Norwegian. 10 pages.
- (2022). NERDz: A Preliminary Dataset of Named Entities for Algerian. 7 pages.
- (2022). Measuring Harmful Representations in Scandinavian Language Models. 8 pages.
- (2022). Exploring the Effects of Negation and Grammatical Tense on Bias Probes. 7 pages.
- (2022). EventGraph: Event Extraction as Semantic Graph Parsing. 9 pages.
- (2022). EventGraph at CASE 2021 Task 1: A General Graph-based Approach to Protest Event Extraction. 6 pages.
- (2022). Annotating Norwegian language varieties on Twitter for Part-of-speech. 6 pages.
- (2021). Using Gender- and Polarity-Informed Models to Investigate Bias. 9 pages.
- (2021). The interplay between language similarity and script on a novel multi-layer Algerian dialect corpus. 13 pages.
- (2021). NorDial: A Preliminary Corpus of Written Norwegian Dialect Use. 7 pages.
- (2020). Named Entity Recognition without Labelled Data: A Weak Supervision Approach. 16 pages.
- (2020). LTG-ST at NADI Shared Task 1: Arabic Dialect Identification using a Stacking Classifier. 7 pages.
- (2020). Identifying Sentiments in Algerian Code-switched User-generated Comments. 8 pages.
- (2020). Gender and sentiment, critics and authors: a dataset of Norwegian book reviews. 14 pages.
- (2019). Measuring Diachronic Evolution of Evaluative Adjectives with Word Embeddings: the Case for English, Norwegian, and Russian. 8 pages.
- (2019). Lexicon information in neural sentiment analysis: a multi-task learning approach. 12 pages.
- (2018). NoReC: The Norwegian Review Corpus. 6 pages.
- (2018). Automatic identification of unknown names with specific roles. 9 pages.
- (2014). Constructions: a new unit of analysis for corpus-based discourse analysis. 11 pages.
- (2021). Using Gender- and Polarity-informed Models to Investigate Bias.
- (2018). Automatically identifying names of unrecognized politicians.
- (2015). A computational approach to organize and analyze online communication data.
- (2013). Applying Corpus Techniques to Climate Change Blogs.
- (2024). Large Language Models and their usage in EAL education. 139-160. In: Current Issues in English Teaching. Fagbokforlaget.
OPINION COST action: https://www.cost.eu/actions/CA21129/
MediaFutures: https://mediafutures.no/2021/01/20/postdoc-samia-touileb/
NorDial: https://github.com/jerbarnes/nordial