PaperPulse logo
FeedTopicsAI Researcher FeedBlogPodcastAccount

Stay Updated

Get the latest research delivered to your inbox

Platform

  • Home
  • About Us
  • Search Papers
  • Research Topics
  • Researcher Feed

Resources

  • Newsletter
  • Blog
  • Podcast
PaperPulse•

AI-powered research discovery platform

© 2024 PaperPulse. All rights reserved.

Semantic Tree Inference on Text Corpa using a Nested Density Approach together with Large Language Model Embeddings

ArXivSource

Thomas Haschka, Joseph Bakarji

cs.CL
cs.AI
|
Dec 29, 2025
6 views

One-line Summary

The paper introduces a nested density clustering method to create hierarchical semantic trees from text corpora using large language model embeddings, enabling data-driven discovery of research areas and their subfields without predefined categories.

Plain-language Overview

Recent advancements in large language models (LLMs) have improved how we classify and understand text based on its meaning. However, understanding the overall structure of semantic relationships in large text collections remains challenging. This study proposes a new method to create hierarchical trees that show how different texts are related to each other in terms of meaning. By using this method, researchers can uncover the natural organization of research areas and subfields without needing to categorize them in advance. The approach is tested on various datasets, showing its effectiveness across different types of text collections.

Technical Details