
Commit 2400cb0

Copilot and josix committed
Regenerate CSV files with proper Python terminology and consolidation approach
Co-authored-by: josix <18432820+josix@users.noreply.github.com>
1 parent f722995 commit 2400cb0


3 files changed, +336 -17940 lines changed


TERMINOLOGY_DICTIONARY.md

Lines changed: 22 additions & 24 deletions
@@ -18,20 +18,20 @@ The complete terminology dictionary containing important terms identified from P
 - **directory**: Directory of the source file
 - **example_files**: List of up to 5 files containing this term
 
-Total entries: ~14,700 unique terms
+Total entries: ~196 essential Python terms
 
 ### focused_terminology_dictionary.csv
-A curated subset of ~2,900 terms focusing on the most important Python terminology. Includes additional columns:
+A curated subset of ~118 terms focusing on the most important Python terminology. Includes additional columns:
 - **priority**: High/Medium priority classification
 - **category**: Term classification
 
 #### Categories:
 - **Core Concepts** (7 terms): class, function, method, module, package, object, type
 - **Built-in Types** (9 terms): int, str, list, dict, tuple, set, float, bool, complex
-- **Keywords/Constants** (8 terms): None, True, False, return, import, def, async, await
-- **Exceptions** (690 terms): All *Error and *Exception terms
-- **Code Elements** (825 terms): Terms in backticks, magic methods
-- **Common Terms** (1,365 terms): Frequently used technical terms
+- **Keywords/Constants** (25 terms): None, True, False, return, import, def, async, await, and other Python keywords
+- **Exceptions** (29 terms): Common *Error and *Exception classes
+- **Code Elements** (14 terms): Magic methods like __init__, __str__, etc.
+- **Common Terms** (34 terms): Important technical concepts like decorator, generator, iterator
 
 ## Maintenance
 
@@ -59,23 +59,21 @@ CSV files use UTF-8 encoding to properly handle Chinese characters. Compatible w
 
 ## Maintenance
 
-### Adding New Patterns
-To extend pattern recognition, modify `extract_key_terms()` function in `extract_terminology.py`:
-
-```python
-# Add new technical patterns
-tech_patterns = [
-    r'\b(?:new_pattern_here)\b',
-    # existing patterns...
-]
-```
-
-### Adjusting Filters
-Modify filtering criteria in `is_significant_term()` and `create_focused_dictionary()` functions.
-
-### Performance Optimization
-- Current processing: ~509 files in 2-3 minutes
-- Memory usage: ~50MB peak
-- Scalable to larger repositories
+### Adding New Terms
+New terms can be identified and added based on:
+- Frequency of appearance in documentation
+- Importance to Python concepts
+- Consistency needs across translation files
+
+### Manual Curation Process
+The dictionaries are maintained through careful analysis of:
+- Core Python terminology in official documentation
+- Existing translation patterns in .po files
+- Category-based organization for translator efficiency
+
+### Quality Assurance
+- Regular review of term translations for consistency
+- Cross-reference with official Python terminology
+- Validation against established translation conventions
 
 This documentation provides comprehensive guidance for maintaining and using the translation dictionary system to ensure consistent, high-quality Python documentation translation.
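
The per-category counts quoted in the diff (7 + 9 + 25 + 29 + 14 + 34 = 118) could be re-checked after any manual edit to `focused_terminology_dictionary.csv`. The following is a minimal sketch, not the repository's own tooling. The `priority` and `category` columns come from the README. The `term` column name, the sample rows, and the helper `count_terms_by_category` are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def count_terms_by_category(csv_file):
    """Count rows per value of the `category` column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["category"] for row in reader)


# Hypothetical sample data. Only `priority` and `category` are documented
# column names; `term` is an assumed name for the column holding the term.
SAMPLE = """term,priority,category
class,High,Core Concepts
function,High,Core Concepts
int,High,Built-in Types
ValueError,Medium,Exceptions
"""

counts = count_terms_by_category(io.StringIO(SAMPLE))
for category, n in sorted(counts.items()):
    print(f"{category}: {n}")
```

To run the same check against the real file, the in-memory sample can be swapped for `open("focused_terminology_dictionary.csv", encoding="utf-8", newline="")`, which keeps the UTF-8 handling the README calls for with Chinese characters.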
