GitHub Repo | December 11, 2025 at 08:58 AM

Paper to Slide/Presentation converter in One Click

Post author: @githubprojects

Paper2Slides: Turn Research Papers into Presentations in One Click

Let's be honest: turning a dense academic paper into a clear, engaging presentation is a chore. You're flipping between sections, trying to distill complex ideas into bullet points, and hunting for key figures—all while the clock ticks toward your deadline. What if you could skip that grunt work entirely?

Enter Paper2Slides, an open-source tool that automates the heavy lifting. It takes a research paper (PDF) and generates a structured slide deck for you. It's not just a simple text extractor; it intelligently identifies the core components of a paper and maps them into a standard presentation format.

What It Does

Paper2Slides is a Python-based tool that processes a PDF of an academic paper and outputs a PowerPoint presentation (.pptx). It uses a combination of layout analysis and natural language processing to break the paper down. It looks for the standard sections—Abstract, Introduction, Methodology, Results, Conclusion—and pulls out the most salient sentences and figures. Then, it organizes this content into a logical slide flow, complete with titles, bullet points, and embedded images.
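To make the section-finding step concrete, here is a minimal sketch, assuming the paper's text has already been extracted to a plain string. The heading list, the regex, and the `split_sections` name are illustrative assumptions, not the project's actual code, and real papers use many more heading variants than this.

```python
import re

# Hypothetical heading list; real papers vary ("Methods", "Related Work", ...).
SECTION_NAMES = ["Abstract", "Introduction", "Methodology", "Results", "Conclusion"]

# A line that holds only a section name, optionally numbered ("3. Results").
HEADING_RE = re.compile(
    r"^\s*(?:\d+\.?\s+)?(" + "|".join(SECTION_NAMES) + r")\s*$",
    re.IGNORECASE,
)


def split_sections(text: str) -> dict[str, str]:
    """Group the lines of extracted paper text under their section heading."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Start a new section; normalize the heading's capitalization.
            current = match.group(1).title()
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return {name: " ".join(lines) for name, lines in sections.items()}


sample = """Abstract
We study slide generation.
1. Introduction
Slides take time to make.
2 Results
Our drafts save time."""

for name, body in split_sections(sample).items():
    print(f"{name}: {body}")
```

Text that appears before the first recognized heading is ignored here, which is one of the places a real layout-aware parser would do better.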

The goal isn't to produce a final, polished presentation you'd use without review. Instead, it gives you a robust, coherent first draft in seconds, saving you hours of manual copying, pasting, and formatting.

Why It's Cool

The clever part is in the pipeline. It doesn't just naively split text. First, it parses the PDF to understand its structure, separating text blocks from figures and tables. Then, it classifies which part of the paper each block belongs to. Using NLP techniques, it scores sentences within each section for importance, selecting the ones that best summarize the content for a slide format. Finally, it stitches it all together with the python-pptx library, creating clean, standardized slides.
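The post doesn't say which scoring method the tool uses, so the sketch below swaps in a deliberately simple one: average word frequency within the section. The stopword list and the `top_sentences` name are assumptions made for illustration only.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "we", "for", "on", "that", "this", "our"}


def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def top_sentences(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(word for s in sentences for word in tokenize(s))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    best = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]


section = (
    "Transformers improve translation quality. "
    "The weather was nice. "
    "Transformers also improve summarization quality."
)
for bullet in top_sentences(section, k=2):
    print("-", bullet)
```

On this sample the off-topic weather sentence is dropped, because its words appear only once in the section while the other two sentences share repeated terms.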

For developers and researchers, this is a neat example of a practical NLP/ML pipeline applied to a real-world problem. The code is modular, so you could tweak the sentence selection algorithm, adjust the slide template, or even retrain the section classifier for different types of documents. It’s a great starting point for anyone interested in automated document understanding.
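As a rough picture of what "modular" can mean here, the sketch below passes the sentence selector in as a function, so swapping the algorithm is a one-argument change. All of the names (`lead_sentences`, `longest_sentences`, `plan_slides`, `write_deck`) are hypothetical and not taken from the repo; only the python-pptx calls inside `write_deck` are that library's real API, following its documented bullet-slide pattern.

```python
from typing import Callable, Dict, List, Tuple

Selector = Callable[[str, int], List[str]]


def lead_sentences(text: str, k: int) -> List[str]:
    """Baseline selector: keep the first k sentences (naive split on '. ')."""
    parts = [p.strip() for p in text.split(". ") if p.strip()]
    return [p if p.endswith(".") else p + "." for p in parts[:k]]


def longest_sentences(text: str, k: int) -> List[str]:
    """Alternative selector: keep the k longest sentences, in original order."""
    parts = lead_sentences(text, k=10**6)
    by_length = sorted(range(len(parts)), key=lambda i: len(parts[i]), reverse=True)
    return [parts[i] for i in sorted(by_length[:k])]


def plan_slides(
    sections: Dict[str, str], select: Selector = lead_sentences, k: int = 3
) -> List[Tuple[str, List[str]]]:
    """Turn {section title: text} into a list of (title, bullets) pairs."""
    return [(title, select(text, k)) for title, text in sections.items()]


def write_deck(plan: List[Tuple[str, List[str]]], path: str) -> None:
    """Write the plan to a .pptx file. Requires `pip install python-pptx`."""
    from pptx import Presentation  # imported lazily so planning works without it

    prs = Presentation()
    layout = prs.slide_layouts[1]  # "Title and Content" in the default template
    for title, bullets in plan:
        slide = prs.slides.add_slide(layout)
        slide.shapes.title.text = title
        frame = slide.shapes.placeholders[1].text_frame
        frame.text = bullets[0] if bullets else ""
        for bullet in bullets[1:]:
            frame.add_paragraph().text = bullet
    prs.save(path)


sections = {"Introduction": "A one. B two two. C."}
print(plan_slides(sections, k=2))
print(plan_slides(sections, select=longest_sentences, k=1))
```

Keeping the plan as plain data until the last step means the selection logic can be tested without generating a file; calling `write_deck(plan, "draft.pptx")` (an example filename) would then produce the deck.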

How to Try It

The project is on GitHub, and getting it running is straightforward if you have a Python environment.

Clone the repo:

```
git clone https://github.com/HKUDS/Paper2Slides.git
cd Paper2Slides
```

Install the required packages (check the requirements.txt in the repo):
```
pip install -r requirements.txt
```
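The main script is paper2slides.py. Run it with the path to your target PDF (if the entry point or flag has changed since this post, the repo's README has the current command):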
```
python paper2slides.py --pdf_path /path/to/your/paper.pdf
```

A .pptx file will be generated in the same directory. You'll want to open it up, refine the content, and adjust the design, but the core structure and extracted highlights will already be there.
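If you want a quick sanity check before opening the file, one option is to count the slides from Python. This relies only on the fact that a .pptx file is a ZIP archive with one XML part per slide; the `count_slides` helper is an illustration and not part of Paper2Slides.

```python
import re
import zipfile

# Slide parts live at ppt/slides/slide<N>.xml inside the archive.
SLIDE_PART = re.compile(r"ppt/slides/slide\d+\.xml")


def count_slides(pptx_path) -> int:
    """Count the slide parts in a .pptx file (path or file-like object)."""
    with zipfile.ZipFile(pptx_path) as archive:
        return sum(1 for name in archive.namelist() if SLIDE_PART.fullmatch(name))


# Example (the filename is a placeholder for whatever the tool generated):
# print(count_slides("paper.pptx"))
```

A count of zero or one on a full-length paper is a hint that section detection did not find much to work with.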

Final Thoughts

As a developer, I see tools like Paper2Slides as incredibly useful productivity boosters. This one tackles a specific, time-consuming task with a pragmatic, automated approach. While you'll still need to add your own insights and polish, eliminating the initial scaffolding work is a huge win.

It's also a well-structured project for learning. You can see how the creators connected PDF parsing, text analysis, and document generation. You could fork it and adapt it for business reports, technical documentation, or any other structured long-form content. It’s a solid open-source utility that does one job and does it well.

Check out the code, run it on your last paper, and see how many hours it saves you.

Follow us for more cool projects: @githubprojects

Repository: https://github.com/HKUDS/Paper2Slides
