feat: Migrate document parsing to Aliyun and update embedding configurations
- Updated LocalDocumentParser to include raw_layouts and artifact_prefix from settings.
- Added new documents with failure reasons and metadata to documents.json for better error tracking.
- Created a new documentation file detailing the Aliyun ingest implementation process.
- Updated RFC to reflect changes in the parsing backend and embedding dimensions.
- Modified tests to accommodate the new embedding dimension of 1024 and updated parser and chunk builder assertions.
- Verified migration configurations to ensure correct settings for embedding model and backend.
@@ -32,6 +32,10 @@ async def get_config():
"embedding_dim": settings.embedding_dim,
"embedding_base_url": settings.embedding_base_url,
"milvus_collection": settings.milvus_collection,
"parser_backend": settings.parser_backend,
"chunk_backend": settings.chunk_backend,
"artifact_prefix": settings.document_parse_artifact_prefix,
"parser_failure_mode": settings.parser_failure_mode,
"llm_provider": settings.llm_provider,
"llm_model": settings.llm_model,
"document_metadata_path": settings.document_metadata_path,
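For orientation, here is a minimal, self-contained sketch of how a handler like `get_config` might expose these settings. It is not the repository's code. The `Settings` dataclass is a hypothetical stand-in for the project's settings object, and every default value is a placeholder except `embedding_dim=1024`, which comes from the commit message. Only the keys visible in the hunk are included; the real handler may return more.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for the project's settings object."""

    embedding_dim: int = 1024  # dimension named in the commit message
    # All values below are illustrative placeholders, not values from the repo.
    embedding_base_url: str = "https://embedding.example.invalid/v1"
    milvus_collection: str = "documents"
    parser_backend: str = "aliyun"
    chunk_backend: str = "local"
    document_parse_artifact_prefix: str = "artifacts/parse"
    parser_failure_mode: str = "fail"
    llm_provider: str = "example-provider"
    llm_model: str = "example-model"
    document_metadata_path: str = "documents.json"


settings = Settings()


async def get_config() -> dict:
    # Mirrors the key-to-attribute mapping shown in the hunk. Note that the
    # public key "artifact_prefix" reads from document_parse_artifact_prefix.
    return {
        "embedding_dim": settings.embedding_dim,
        "embedding_base_url": settings.embedding_base_url,
        "milvus_collection": settings.milvus_collection,
        "parser_backend": settings.parser_backend,
        "chunk_backend": settings.chunk_backend,
        "artifact_prefix": settings.document_parse_artifact_prefix,
        "parser_failure_mode": settings.parser_failure_mode,
        "llm_provider": settings.llm_provider,
        "llm_model": settings.llm_model,
        "document_metadata_path": settings.document_metadata_path,
    }


if __name__ == "__main__":
    config = asyncio.run(get_config())
    for key in sorted(config):
        print(f"{key} = {config[key]!r}")
```

Exposing the parser and chunk backends alongside the embedding settings lets a client or test confirm which backend is active without reading the server's environment.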