Supported File Extensions and LlamaIndex Loaders
This document lists the file extensions found in the ./../../../data directory and the corresponding LlamaIndex loaders that can be used to process them.
Document Formats
| Extension | File Type | LlamaIndex Loader | Installation Package |
|-----------|-----------|-------------------|----------------------|
| .pdf | Portable Document Format | PDFReader or SimpleDirectoryReader | llama-index-readers-file |
| .docx | Microsoft Word Document | DocxReader or SimpleDirectoryReader | llama-index-readers-file |
| .xlsx | Microsoft Excel Spreadsheet | PandasExcelReader or SimpleDirectoryReader | llama-index-readers-file |
| .pptx | Microsoft PowerPoint Presentation | PptxReader or SimpleDirectoryReader | llama-index-readers-file |
| .odt | OpenDocument Text | SimpleDirectoryReader with UnstructuredReader | llama-index-readers-file |
Image Formats
| Extension | File Type | LlamaIndex Loader | Installation Package |
|-----------|-----------|-------------------|----------------------|
| .png | Portable Network Graphics | ImageReader | llama-index-readers-file |
| .jpg | JPEG Image | ImageReader | llama-index-readers-file |
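
As a minimal sketch of how the document and image tables above might be wired together, the mapping below pairs each extension with the reader class named in its row and passes the instances to SimpleDirectoryReader through its `file_extractor` argument. The names `READER_BY_EXTENSION`, `build_file_extractor`, and `load_documents` are illustrative and not part of LlamaIndex. Some readers (for example UnstructuredReader or ImageReader) may need extra packages beyond llama-index-readers-file, depending on the installed version.

```python
# Extension -> reader class name, copied from the tables in this document.
READER_BY_EXTENSION = {
    ".pdf": "PDFReader",
    ".docx": "DocxReader",
    ".xlsx": "PandasExcelReader",
    ".pptx": "PptxReader",
    ".odt": "UnstructuredReader",
    ".png": "ImageReader",
    ".jpg": "ImageReader",
}


def build_file_extractor():
    """Instantiate one reader per extension (hypothetical helper)."""
    # Imported lazily so the mapping can be inspected without the package installed.
    import llama_index.readers.file as readers

    return {ext: getattr(readers, name)() for ext, name in READER_BY_EXTENSION.items()}


def load_documents(data_dir):
    """Load every supported file under data_dir with the explicit readers."""
    from llama_index.core import SimpleDirectoryReader

    reader = SimpleDirectoryReader(
        input_dir=str(data_dir),
        recursive=True,
        file_extractor=build_file_extractor(),
    )
    return reader.load_data()
```

Under these assumptions, `load_documents("./../../../data")` would return a list of LlamaIndex `Document` objects. Dropping the `file_extractor` argument falls back to the readers that SimpleDirectoryReader selects by default.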
Archive Formats
| Extension | File Type | LlamaIndex Loader | Installation Package |
|-----------|-----------|-------------------|----------------------|
| .zip | ZIP Archive | SimpleDirectoryReader with archive support | llama-index-readers-file |
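
One way to picture the archive step is to unpack the ZIP file into a working directory and then point SimpleDirectoryReader at that directory. The notes in this document name patool for archive support. The sketch below swaps in the standard-library `zipfile` module so that it runs without extra packages. `extract_zip` is a hypothetical helper, not a LlamaIndex or patool function.

```python
import zipfile
from pathlib import Path


def extract_zip(archive_path, dest_dir):
    """Unpack archive_path into dest_dir and return the extracted file paths."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        # zipfile sanitizes member names, so entries are written below dest.
        zf.extractall(dest)
    # Return files only, sorted for a stable order.
    return sorted(p for p in dest.rglob("*") if p.is_file())
```

The returned paths, or the destination directory itself, can then be passed to whichever reader matches each extracted file's extension.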
System/Special Files (Ignored)
- .DS_Store - macOS Finder metadata file
- .gitignore - Git ignore-rules file
Audio/Video Formats (Skipped as per requirements)
- .m4a - Audio file
- .mp3 - Audio file
- .mp4 - Video file
- .ogg - Audio/Video file
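
The categories listed so far (supported, ignored, skipped) can be expressed as a small lookup, sketched below with only the standard library. The function name `classify` and the `"unknown"` fallback are assumptions made for illustration. The extension sets are copied from this document.

```python
from pathlib import Path

SUPPORTED_EXTS = {".pdf", ".docx", ".xlsx", ".pptx", ".odt", ".png", ".jpg", ".zip"}
IGNORED_NAMES = {".DS_Store", ".gitignore"}
SKIPPED_EXTS = {".m4a", ".mp3", ".mp4", ".ogg"}


def classify(path):
    """Return 'ignored', 'skipped', 'supported', or 'unknown' for a file path."""
    p = Path(path)
    # System/special files are matched by full name because they have no suffix.
    if p.name in IGNORED_NAMES:
        return "ignored"
    ext = p.suffix.lower()  # compare case-insensitively
    if ext in SKIPPED_EXTS:
        return "skipped"
    if ext in SUPPORTED_EXTS:
        return "supported"
    return "unknown"
```

Running `classify` over a directory listing before loading gives a quick report of which files will be processed and which will be left out.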
Notes
- Many file types can be loaded with SimpleDirectoryReader, which selects a reader for each file based on its extension.
- For advanced document parsing, a format-specific reader such as PDFReader or DocxReader might offer better performance or more features.
- All required dependencies have been installed via llama-index-readers-file, plus patool for archive support.
- No external API keys are required for the supported file types, because all processing runs locally.
- The system prioritizes local processing over cloud services as per requirements.
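
As a rough illustration of the first note, the sketch below relies on SimpleDirectoryReader's own extension-based reader selection and uses `required_exts` to restrict loading to the document and image formats listed earlier, which also keeps audio and video files out. `reader_kwargs` and `load_with_defaults` are hypothetical names. `.odt` is left out of the list on the assumption that it needs the explicit UnstructuredReader mapping named in the table.

```python
# Formats from this document that are assumed to work with the default readers.
DEFAULT_EXTS = [".pdf", ".docx", ".xlsx", ".pptx", ".png", ".jpg"]


def reader_kwargs(data_dir):
    """Build keyword arguments for SimpleDirectoryReader (hypothetical helper)."""
    return {
        "input_dir": str(data_dir),
        "recursive": True,
        "required_exts": DEFAULT_EXTS,  # anything else, e.g. .mp3, is not loaded
        "exclude_hidden": True,  # leaves out dotfiles such as .DS_Store
    }


def load_with_defaults(data_dir):
    """Load files using the readers SimpleDirectoryReader picks on its own."""
    # Imported lazily so reader_kwargs can be used without the package installed.
    from llama_index.core import SimpleDirectoryReader

    return SimpleDirectoryReader(**reader_kwargs(data_dir)).load_data()
```

Keeping the arguments in a plain dictionary makes the loading policy easy to inspect or log before any file is read.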