User talk:نعم البدل
سماپت
Four uses of this term are: 1: https://www.rekhta.org/nazms/haat-dopahar-tak-v-sudhakar-rao-nazms?lang=ur 2: https://www.facebook.com/103918755700987/posts/113286171430912/?mibextid=jtWzXIAxfKx1VBOC 3: https://www.dw.com/ur/%D9%85%D8%A7%DA%BA-%D8%AC%DB%8C-%DA%A9%D8%A7-%D8%B1%DB%8C%DA%88%DB%8C%D9%88-%D8%A8%DA%86%D9%BE%D9%86-%DA%A9%D8%A7-%D8%A8%DB%8C-%D8%A8%DB%8C-%D8%B3%DB%8C-%D8%A7%D9%88%D8%B1-%D8%B1%D8%AD%DB%8C%D9%85-%D8%A7%D9%84%D9%84%DB%81-%DB%8C%D9%88%D8%B3%D9%81%D8%B2%D8%A6%DB%8C-%DA%A9%DB%8C-%D8%A2%D9%88%D8%A7%D8%B2/a-59501164 4: https://books.google.com.bd/books?id=VyyZgZFz3cQC&q=%D8%B3%D9%85%D8%A7%D9%BE%D8%AA&dq=%D8%B3%D9%85%D8%A7%D9%BE%D8%AA&hl=en&newbks=1&newbks_redir=0&source=gb_mobile_search&ovdme=1&sa=X&ved=2ahUKEwi218_g_rmDAxXvRmwGHWt0BPEQ6AF6BAgMEAM#%D8%B3%D9%85%D8%A7%D9%BE%D8%AA গহীনঅরণ্য (talk) 09:55, 1 January 2024 (UTC)
'Old Hindi' (continued, part 3)
- Let me know if it makes more sense for Middle Hindi to be a full-fledged language rather than an etymological language. CAT:Urdu terms derived from Middle Hindi has over 20 entries now. Middle Hindi would need to be a full-fledged language if there is a reason to make a distinction between Middle Hindi and Old Hindi such as for spellings or quotations. However, it is important to remember what you said earlier:
[this may simply highlight] the borrowings during the different stages (and the changes in meanings), however, there's not much difference between native Old Hindi vocab
- There is at least one potential issue with making Middle Hindi a full-fledged language. Although Old Braj (bra-old) could possibly be a descendant of Old Hindi, it cannot be a descendant of Middle Hindi because they were contemporaneous. However, since there is very little coverage of Braj, this is not a pressing issue.
- I created my first Perso-Arabic Middle Hindi entry at Old Hindi فَرْمان (frman). Having Middle Hindi as an etymological language with Old Hindi as its parent seems to be working for entries and descendants trees because it is being treated as a later variety of Old Hindi. Perhaps showing an unattested Devanagari equivalent on the headword line would not be appropriate since Old Hindi and Middle Hindi are not classical languages.
Kutchkutch (talk) 20:51, 1 January 2024 (UTC)
- @Kutchkutch:
Let me know if it makes more sense for Middle Hindi to be a full-fledged language rather than an etymological language
– For the moment, I don't see a need to make Middle Hindi a fully-fledged language. I don't intend to make any specific Middle Hindi lemmas, so making it subordinate to Old Hindi is fine in my opinion, because for the moment I do like that there's a bit of leniency between Old Hindi and Middle Hindi. Making Middle Hindi a fully-fledged language even though there's not much to go on at the minute might seem a bit hasty, unless you have some plans for it. A lot of Urdu lemmas need to be either sorted out or added as well, which is a bit higher on my to-do list. نعم البدل (talk) 22:09, 1 January 2024 (UTC)
Making Middle Hindi a fully-fledged language even though there's not much to go on at the minute might seem a bit hasty
- Yes, that's true, especially considering that there's not much material on Middle Hindi in Devanagari. I just wanted to see what you have to say about it, so thanks for the input.
A lot of Urdu lemmas need to be either sorted out or added as well, which is a bit higher on my to-do list
- Creating entries for historical languages may certainly require more time compared to modern languages.
- Kutchkutch (talk) 00:51, 2 January 2024 (UTC)
- @Kutchkutch There's also the issue of Middle Hindi [Term?] linking to Wikipedia but no such article exists. 178.120.10.176 21:06, 10 June 2024 (UTC)
Terms from Prakrit
- For Urdu & Shahmukhi Punjabi etymologies from Prakrit would it be better to show Prakrit in the Brahmi script or Devanagari? If there is no reason to choose one or the other, then Brahmi would be preferable since that would have been the original script when Prakrit was a spoken language.
- My impression is that speakers of [Pakistani] Urdu and Punjabi are generally unfamiliar with Devanagari, while Indian speakers of Urdu & Punjabi are generally familiar with Devanagari. If Devanagari is to be displayed in etymology sections instead of Brahmi it can be done so as
{{inh|ur|pra|𑀡𑀺𑀤𑁆𑀤𑀸|णिद्दा}}
or {{inh|ur|pra|णिद्दा}}
if there is a Devanagari redirect page.
- Thanks for editing the descendants tree at the Sanskrit entry निद्रा#Descendants. However, since there is a large descendants tree at the Prakrit entry 𑀡𑀺𑀤𑁆𑀤𑀸#Descendants, perhaps it would be better to put all the Prakrit-derived descendants at the Prakrit entry and put
{{desc|pra|𑀡𑀺𑀤𑁆𑀤}}
{{see desc}}
next to the Prakrit term at the Sanskrit entry. Kutchkutch (talk) 21:12, 1 January 2024 (UTC)
would it be better to show Prakrit in the Brahmi script or Devanagari?
– It makes no difference to me. If you're referring to my change at نِیند (nīnd), it wasn't meant to be anything objective, lol – I was trying something out and forgot to change the Prakrit lemma back to the Brahmi script. Brahmi or Devanagari, either is fine.
perhaps it would be better to put all the Prakrit-derived descendants at the Prakrit entry and put
Oh right, yeah a desctree would probably be better, I was just fixing the Urdu and Punjabi lemmas! نعم البدل (talk) 22:13, 1 January 2024 (UTC)
speakers of Urdu and Punjabi are generally unfamiliar with Devanagari...
- what was meant was: speakers of Pakistani Urdu and Punjabi are generally unfamiliar with Devanagari...
it wasn't meant to be anything objective, lol...Brahmi or Devanagari, either is fine
- Thanks for the clarification. If you didn't know already, there exists a typing aid at MOD:typing-aids/data/inc-pra for Brahmi, so
{{subst:chars|pra|NiddA}}
and {{subst:chars|pra|ṇiddā}}
both display 𑀡𑀺𑀤𑁆𑀤𑀸.
I was just fixing the Urdu and Punjabi lemmas!
- That is understandable because if the intention is to just fix one or two languages, then restructuring the entire descendants tree is too large of a task.
- Kutchkutch (talk) 00:45, 2 January 2024 (UTC)
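As a rough illustration of what a typing aid like the one mentioned above does, here is a minimal Python sketch. It is not the actual Lua code at MOD:typing-aids; the function name `to_brahmi` and the tiny character tables are assumptions that cover only the letters needed for this one word. It first expands the ASCII shortcuts (N, A) into ṇ and ā, then maps consonants and vowel signs to Brahmi, inserting a virama between adjacent consonants.

```python
import unicodedata


def _b(name):
    # Look up a Brahmi character by its Unicode character name.
    return unicodedata.lookup("BRAHMI " + name)


# Hypothetical, deliberately tiny tables: just enough for this one word.
ASCII_SHORTCUTS = {"N": "\u1e47", "A": "\u0101"}  # N -> n with dot below, A -> a with macron
CONSONANTS = {"\u1e47": _b("LETTER NNA"), "d": _b("LETTER DA")}
VOWEL_SIGNS = {"a": "", "\u0101": _b("VOWEL SIGN AA"), "i": _b("VOWEL SIGN I")}
VIRAMA = _b("VIRAMA")


def to_brahmi(text):
    # Step 1: expand ASCII shortcuts, then normalise so that accented
    # letters are single precomposed code points.
    text = "".join(ASCII_SHORTCUTS.get(ch, ch) for ch in text)
    text = unicodedata.normalize("NFC", text)
    out = []
    for i, ch in enumerate(text):
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # A consonant not followed by a vowel loses its inherent "a".
            if nxt not in VOWEL_SIGNS:
                out.append(VIRAMA)
        elif ch in VOWEL_SIGNS and i > 0 and text[i - 1] in CONSONANTS:
            out.append(VOWEL_SIGNS[ch])
        else:
            raise ValueError(f"character not covered by this sketch: {ch!r}")
    return "".join(out)


# Both spellings from the discussion should give the same Brahmi string.
print(to_brahmi("NiddA") == to_brahmi("\u1e47idd\u0101"))
```

Word-initial vowels and the rest of the alphabet are left out on purpose; the real data module presumably handles them.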
vocalisation of ہوے
Hi,
Just a friendly reminder-request to review the inflection of ہونا (honā) per this revision, if you're still interested in it.
Also Module:number list/data/ur has a number of numerals with bad or no vocalisations and no transliterations. If it's too much work to review, maybe this module should be reduced (0 to 10 or 10 to 20, etc) or deleted?
To make a template similar to {{hi-cardinals}}, the above list needs to be cleaned up. Anatoli T. (обсудить/вклад) 05:53, 3 January 2024 (UTC)
- @Atitarev Yes! I'm sorry, it went over my head. At first glance, it does contain mistakes. Could I ask why you didn't just use Template:ur-conj-v? And yes, I'll fix the number module for Urdu! نعم البدل (talk) 19:17, 4 January 2024 (UTC)
- @نعم البدل: Thanks for the response. I'm not sure if {{ur-conj-v}} produces the correct results for this verb.
- Please ping me when you're done with the list or decide to shorten/abandon it. Anatoli T. (обсудить/вклад) 22:29, 4 January 2024 (UTC)
- Hi, I have made a couple of small edits to Module:number list/data/ur but it's not easy to edit. I recommend copying the text to a sandbox page. Anatoli T. (обсудить/вклад) 00:14, 8 January 2024 (UTC)
Quadmix77
I noticed this user has been adding Punjabi pronunciations recently. Could you check them out when you have time? Rodrigo5260 (talk) 15:01, 7 January 2024 (UTC)
"ai"
Sorry for another ping. Is this a correct vocalisation in Urdu to produce "ai": نَِک ٹائی (naik ṭāī)? Found this in Rekhta. Anatoli T. (обсудить/вклад) 00:16, 8 January 2024 (UTC)
- @Atitarev: [ɛ] (not prolonged) in foreign words usually becomes a short 'e' ([e] or [ɛ]) in Urdu, which is typically represented with a kasrah, so the correct vocalisation would be نِک ٹائی (nik ṭāī) or نِکْٹائِی (nikṭāī). Rekhta and UDB include both a fatha and a kasrah (which I assume is kept from older dictionaries) to represent both spellings, but that's because neither a fatha nor a kasrah is exactly [ɛ]. نعم البدل (talk) 00:38, 8 January 2024 (UTC)
- Thanks! I have corrected it to use a kasra at necktie#Translations. I've also added both نیک ٹائی (nek ṭāī) and نیکْٹائی (nekṭāī). So the Urdu section shows multiple variants now. Anatoli T. (обсудить/вклад) 01:02, 8 January 2024 (UTC)
- Oh, I see you corrected to نَیکْٹائی (naikṭāī). There is a Hindi spelling नेकटाई (nekṭāī); wouldn't نیکْٹائی (nekṭāī) be more accurate, matching both English and Hindi? (It doesn't have to match, of course, you will know better).
- Rekhta uses نِیک ٹائی (nīk ṭāī), which doesn't seem right. Anatoli T. (обсудить/вклад) 01:09, 8 January 2024 (UTC)
- @Atitarev: I've noticed that Hindi borrowings from English tend to be a mix of pronunciation and transliteration. Urdu is mainly pronunciation, and the spelling becomes approximated to the nearest vowel. I'm actually wrong in this case, since necktie has evolved into نِک ٹائی (nik ṭāī) but also نِیک ٹائی (nīk ṭāī) not نَیک ٹائی (naik ṭāī). نعم البدل (talk) 01:16, 8 January 2024 (UTC)
سمکالین
Greetings
Three uses of this term are: 1: https://books.google.com.bd/books?id=pv6xEAAAQBAJ&pg=PT30&lpg=PT30&dq=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86+urdu&source=bl&ots=KbeDF0GiZQ&sig=ACfU3U3aM5QvSN4dCZiYix8Cz0BQ_iuc-Q&hl=en&sa=X&ved=2ahUKEwijl_Hf4tSDAxX2zDgGHTznA604ChDoAXoECAcQAg#v=onepage&q=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86%20urdu&f=false 2: https://www.hindsamachar.in/national-news/news/forced-population-displacement-is-against-human-development-38019 3: https://m.thewireurdu.com/article/ravi/55520
This term must be in Urdu Wiktionary. Kingfolker255 (talk) 17:16, 14 January 2024 (UTC)
- @Kingfolker255: – You will have to point out where the word is being employed in the first citation. The second citation is valid. The third one is a quote, please refer to the Use-Mention distinction policy. نعم البدل (talk) 09:11, 31 January 2024 (UTC)
- Uses of this term found in old works are at these links:
- 1:
- https://books.google.com.bd/books?id=pv6xEAAAQBAJ&pg=PT30&lpg=PT30&dq=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86+urdu&source=bl&ots=KbeDF0GiZQ&sig=ACfU3U3aM5QvSN4dCZiYix8Cz0BQ_iuc-Q&hl=en&sa=X&ved=2ahUKEwijl_Hf4tSDAxX2zDgGHTznA604ChDoAXoECAcQAg#v=onepage&q=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86%20urdu&f=false
- 2:
- https://www.hindsamachar.in/national-news/news/forced-population-displacement-is-against-human-development-38019
- 3:
- https://www.hindsamachar.in/top-news/news/india-s-technological-advancement-self-sufficiency-bandaru-dattatreya-30904
- 4:
- https://m.thewireurdu.com/article/ravi/55520
- 5:
- https://books.google.com.bd/books?id=lXmYas5iWWMC&q=%D8%AA%D8%AA%DA%A9%D8%A7%D9%84%DB%8C%D9%86+urdu&dq=%D8%AA%D8%AA%DA%A9%D8%A7%D9%84%DB%8C%D9%86+urdu&hl=en&newbks=1&newbks_redir=0&source=gb_mobile_search&ovdme=1&sa=X&ved=2ahUKEwiBiY7PpoeEAxUqwTgGHSTpBD8Q6AF6BAgFEAM#%D8%AA%D8%AA%DA%A9%D8%A7%D9%84%DB%8C%D9%86%20urdu
- These are the citations. Kingfolker255 (talk) 09:27, 31 January 2024 (UTC)
- @Kingfolker255: Please see my response at Citations:سمکالین. I would advise you not to add transliterations of Hindi lemmas as Urdu lemmas. نعم البدل (talk) 09:31, 31 January 2024 (UTC)
- Thanks. Kingfolker255 (talk) 09:35, 31 January 2024 (UTC)
- Let me relink the uses:
- 1:
- https://books.google.com.bd/books?id=pv6xEAAAQBAJ&pg=PT30&lpg=PT30&dq=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86+urdu&source=bl&ots=KbeDF0GiZQ&sig=ACfU3U3aM5QvSN4dCZiYix8Cz0BQ_iuc-Q&hl=en&sa=X&ved=2ahUKEwijl_Hf4tSDAxX2zDgGHTznA604ChDoAXoECAcQAg#v=snippet&q=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86%20&f=false
- 2:
- https://www.hindsamachar.in/national-news/news/forced-population-displacement-is-against-human-development-38019
- 3:
- https://www.hindsamachar.in/top-news/news/india-s-technological-advancement-self-sufficiency-bandaru-dattatreya-30904
- 4:
- https://books.google.com.bd/books?id=RdVjAAAAMAAJ&q=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86&dq=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86&hl=en&newbks=1&newbks_redir=0&source=gb_mobile_search&ovdme=1&sa=X&ved=2ahUKEwi1iqiNqYeEAxVgQ2cHHfGYDL4Q6AF6BAgLEAM#%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86
- 5:
- https://books.google.com.bd/books?id=nKhjAAAAMAAJ&q=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86&dq=%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86&hl=en&newbks=1&newbks_redir=0&source=gb_mobile_search&ovdme=1&sa=X&ved=2ahUKEwjbsqbZk4mEAxUgamwGHSe0D-AQ6AF6BAgJEAM#%D8%B3%D9%85%DA%A9%D8%A7%D9%84%DB%8C%D9%86 Kingfolker255 (talk) 04:48, 3 February 2024 (UTC)
- @Kingfolker255: These are the same ones you gave earlier on? I explained which were, and which weren't suitable. نعم البدل (talk) 07:44, 3 February 2024 (UTC)
Saraiki/Punjabi Transliteration
Hi,
The Shahmukhi transliteration module isn't working properly; it's still in beta, but:
ٻارْھواں (ḇārhoāṉ) بارْھواں (bārhvā̃)
And I don't know if it's part of this module but it doesn't remove the "ghunna diacritic ٘ " when opening a link Notevenkidding (talk) 16:37, 15 February 2024 (UTC)
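For context on the second point: entry titles for Arabic-script languages are normally written without vocalisation, so a link has to strip diacritics from the displayed form before it resolves the target page. The following Python sketch shows that idea only. It is not the module's actual code; the function name `link_target` and the choice to strip every nonspacing combining mark are assumptions, and the real module may use a language-specific list instead.

```python
import unicodedata

NOON_GHUNNA_MARK = "\u0658"  # the "ghunna diacritic" mentioned above


def link_target(display_form):
    """Return the display form with all nonspacing marks (category Mn) removed."""
    return "".join(
        ch for ch in display_form if unicodedata.category(ch) != "Mn"
    )


# The ghunna mark is itself a nonspacing mark, so this approach removes it
# together with sukun, fatha, kasra and similar signs.
print(unicodedata.category(NOON_GHUNNA_MARK))
```

If the link still contains the ghunna mark, one plausible cause is that the mark is missing from whatever list of strippable characters the module uses, but that is a guess.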
- @Sameerhameedy – Would it be okay for your Module:ur-translit to be copied over to Module:pa-Arab-translit and changed as per WT:PA TR? نعم البدل (talk) 17:20, 19 February 2024 (UTC)
- @نعم البدل ofc! go ahead! - سَمِیر | Sameer (مشارکتها · بحث) 18:19, 19 February 2024 (UTC)
Hi,
I am not sure شَمْع (śam') is transliterated correctly, even if there is शमा (śamā). Could you please check the entry? Anatoli T. (обсудить/вклад) 05:13, 4 March 2024 (UTC)
- @Atitarev: The transliteration was based on the Hindi lemma. I've based it on the Urdu transliteration now. (I'll fix the page later)! نعم البدل (talk) 04:53, 5 March 2024 (UTC)
bad characters
Hi. Please be careful adding roots; e.g. this diff [1] added a blacklisted version of the Arabic letter mīm, and the resulting root category can't be created. I'm not sure how this is happening but I fixed about 12-15 such instances a few days ago. Benwing2 (talk) 06:24, 13 March 2024 (UTC)
- @Benwing2: That is weird, I use the standard Urdu keyboard! Is there a difference between the two meem's (in the diffs)? نعم البدل (talk) 07:08, 13 March 2024 (UTC)
- The actual diff where I correct your error is here: [2] I cut and pasted the two strings into Python and the mīm's are the same but your text had a U+2069 character (which is "POP DIRECTIONAL ISOLATE") at the end, directly after the mīm and before the close brace. I don't know how this gets inserted but evidently it does. User:Erutuon you recently made some change concerning RTL embedding vs. isolation so maybe you have some idea what this character is and why it's getting inserted? Benwing2 (talk) 07:46, 13 March 2024 (UTC)
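The check described above (pasting a string into Python to see what it really contains) can be sketched as follows. This is only an illustration, not the script that was actually used; the helper name `invisible_format_chars` and the sample string are assumptions. It lists every character in Unicode general category Cf (format), which includes U+2066 to U+2069 as well as the older U+200E and U+200F marks.

```python
import unicodedata


def invisible_format_chars(text):
    """Return (index, code point, name) for every format character (category Cf)."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# Hypothetical template argument: the letter mīm followed by a stray U+2069.
suspect = "\u0645\u2069"
print(invisible_format_chars(suspect))
# → [(1, 'U+2069', 'POP DIRECTIONAL ISOLATE')]
```

Running the same function over the intended string returns an empty list, which makes the difference between two visually identical strings easy to spot.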
- @Benwing2: I changed the CSS for right-to-left scripts, such as Arabic, from
direction: rtl; unicode-bidi: embed;
to
direction: rtl; unicode-bidi: isolate;
because apparently the "isolate" behavior is newer and recommended by Unicode over the "embed" behavior according to this section of the Unicode Bidirectional Algorithm documentation. I thought that might be causing the characters U+2067 (RIGHT-TO-LEFT ISOLATE) and U+2069 (POP DIRECTIONAL ISOLATE) to be inserted before and after the text when you copy and paste it, but it seems not to be the case. Actually User:Theknightwho made Module:script utilities (which is used by link templates) insert those characters in this edit before I made the CSS change. The characters will sometimes show up when you manually select some Arabic-tagged text from a link, copy it, and then paste it into the edit box. If the syntax highlighter (mw:Extension:CodeMirror) is enabled in the edit box, it shows them as red circles and makes them deletable. Anyway, I think Module:script utilities doesn't need to insert these characters anymore, because the CSS has the same effect as them. – Eru·tuon 00:10, 14 March 2024 (UTC)
- @Erutuon Thanks! User:Theknightwho, what do you think? Can we try getting rid of the code that inserts those characters? They're definitely showing up in places they shouldn't be, presumably as a result of this. Benwing2 (talk) 02:02, 14 March 2024 (UTC)
- @Benwing2 @Erutuon Yes - that sounds good. Theknightwho (talk) 02:22, 14 March 2024 (UTC)
- @Benwing2, Theknightwho, Erutuon: That also explains why declensions are being messed up. I was confused, because over the past couple of months the declensions were just tripping. I noticed the declensions at نُکَّر (nukkar) (Template:pnb-noun-f-c) not being returned with their transliterations, and a Unicode identifier shows that POP DIRECTIONAL ISOLATE is being inserted for whatever reason. نعم البدل (talk) 03:00, 14 March 2024 (UTC)
- And obv no changes were made to the templates. It was driving me a little crazy. نعم البدل (talk) 03:03, 14 March 2024 (UTC)
- @نعم البدل Yeah makes sense. I removed the extra formatting char from نُکَّر (nukkar); hopefully this won't be an issue going forward. Benwing2 (talk) 03:06, 14 March 2024 (UTC)
- Yes, thank you! نعم البدل (talk) 03:07, 14 March 2024 (UTC)
- @Benwing2, Theknightwho: Probably yes, since CSS/HTML code for bidirectional control does not insert them, and neither does any relevant web browser when copying?
- Blacklisted characters can also come from people copying from Digital Dictionaries of South Asia, where the Unicode is often random instead of following current internet usage. But indeed I have long known them to come mostly from linking templates, particularly the ones in translation tables.
- As you all are optimizing BiDi behaviour and are now considering removing a part of the pertinent code, for entropy I remark that there were parts of the dictionary that were displayed incorrectly or dubiously until recently, but which I observe are now fixed, perhaps by Theknightwho’s edit this month: reference templates where all fields are RTL script, and {{+preo}} and related templates (for which reason we pragmatically disabled transcription in it half a decade ago, but now transcription is back and correctly ordered, e.g. on بَعُدَ (baʕuda)).
- Only very rarely (a low three-digit number of quotes at most in the last 8 years, out of my 60,000 edits) have I inserted BiDi characters manually (they are situated on the 4th level of the default Arabic xkeyboard-config layout), between the author and title fields of such references, as is still present at the end of the author field of {{R:ar:Abu Qalam:2016}} and was formerly at شَنْدْقُورَة (šandqūra); Wingerbot removed it in 2020 and now I don’t see a difference any longer. Usually I circumvented the problem by adding some transcription or translation in Latin letters (which is not possible if I don’t know from somewhere how somebody’s name vocalizes …). Fay Freak (talk) 02:48, 14 March 2024 (UTC)
- @Fay Freak Yeah, I have done periodic runs to remove L2R (U+200E) and R2L (U+200F) marks. The code that does this runs the BIDI algorithm on the Wikitext to see if the result is any different without the directional chars and only removes them if not. Recently, however, I made the code more aggressive so it also removes such marks (a) when more than one occurs in a row (leaving only one), and (b) whenever they occur at the end of a template argument or link, based on the observation that it doesn't (or didn't?) make any difference in the actual output whether such marks are present. After the last run I also did a manual postprocessing step, checking the ones that were still present and removing them when it didn't make a difference in the output or actually improved things (e.g. sometimes there was such a mark between two words in an Arabic translation, which caused the words to show in the wrong order). I think the reason such marks kept occurring in the Wikitext was people cutting and pasting Wiktionary output; now that we've removed the code that inserts the directional chars, this might not be an issue in the future. Benwing2 (talk) 02:58, 14 March 2024 (UTC)
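The two "more aggressive" rules described above can be sketched in Python roughly as follows. This is not the bot's actual code: the function name `strip_redundant_marks` is hypothetical, the sketch keeps the first mark of a run (the original does not say which one survives), it treats `|`, `}}` and `]]` as the end of a template argument or link, and it omits the step that runs the BIDI algorithm to compare output with and without the marks.

```python
import re

# U+200E LEFT-TO-RIGHT MARK and U+200F RIGHT-TO-LEFT MARK.
# Rule (b): marks sitting directly before "|", "}}" or "]]", i.e. at the
# end of a template argument or link, are removed entirely.
TRAILING = re.compile(r"[\u200e\u200f]+(?=\||\}\}|\]\])")
# Rule (a): two or more marks in a row are collapsed to the first one.
RUN = re.compile(r"([\u200e\u200f])[\u200e\u200f]+")


def strip_redundant_marks(wikitext):
    wikitext = TRAILING.sub("", wikitext)
    return RUN.sub(r"\1", wikitext)


# Hypothetical translation template with a stray mark before the closing braces.
cleaned = strip_redundant_marks("{{t|ar|\u0643\u200f}}")
print(cleaned == "{{t|ar|\u0643}}")
```

A single mark between two words is left alone by this sketch; per the description above, those cases were checked manually.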
- @Benwing2: Thanks, this sounds even more meticulous than I had expected.
- Your removal of the BiDi character addition in Module:script utilities while I was writing does not cause issues at least in the mentioned two test cases, a good sign, implying that the hypothesis that the same visual result can be reached with CSS alone is correct everywhere. Fay Freak (talk) 03:11, 14 March 2024 (UTC)
provinces of Pakistan, etc. in Module:labels/data/regional
I am trying to clean out the junk from Module:labels/data/regional and move it to language-specific modules. However, a year and some ago you added a whole bunch of Pakistan-related stuff (e.g. 'Khyber Pakhtunwa', 'Islamabad', 'Gilgit-Baltistan', etc.), which remains there because it's not associated with any languages or categories. What language(s) did you intend these to be used with? In general we don't want random toponyms there, but only ones that are associated with actual dialects of some language. Benwing2 (talk) 05:02, 10 April 2024 (UTC)
- @Benwing2: Hi! I believe I was just generally trying to add Pakistani provinces to the regional labels. I had an issue a while back where the Kashmir and Punjab label would only be subordinate to India but not Pakistan, but I wasn't able to solve it. نعم البدل (talk) 02:22, 11 April 2024 (UTC)
- I see. In that case, since these labels aren't used anywhere, I may just remove them. The alternative is to add them to some language-specific data module, but as there is more than one language spoken in Pakistan I'm not sure which module(s) that would be. BTW I'm sure I can help you with the Kashmir and Punjab issue if you can give me some examples of what you were trying to do. Benwing2 (talk) 02:53, 11 April 2024 (UTC)
- @Benwing2: Feel free to remove them. I should have realised they were empty labels. I specify the regional labels for each language now (see User:نعم_البدل/bookmarks#Modules). As for the Kashmir and Punjab issue, I'm not really sure how it should be dealt with, but essentially they should cover both Pakistan and India (whereas at the minute they're both subordinate to India). This is what I proposed at User_talk:Chuck_Entz/2023#Categories_Punjab,_Jammu_&_Kashmir_&_Ladakh and even tried implementing, but I don't think I did it correctly so, again, feel free to sort it out whichever way you feel is best. The reason why I added the Pakistan province labels was that I was trying to imitate the way Indian states/labels are handled on this site, but clearly I couldn't figure it out haha! نعم البدل (talk) 03:03, 11 April 2024 (UTC)
- I see what you mean. Yeah IMO all the names of individual states, provinces and such should specify the associated country in them. We do that with some countries (e.g. it's Category:Arizona, USA not Category:Arizona) but not with others. If we did this here there wouldn't be the current issue with Category:Punjab referring to the Indian state rather than the Pakistani province. I'm going to propose renaming these in the Beer Parlour. Benwing2 (talk) 03:16, 11 April 2024 (UTC)
So many Urdu entries that are actually Sanskrit words.
What can we do about them? It's kind of out of control. Are there any Urdu admins we can ping? 178.120.10.176 16:45, 10 June 2024 (UTC)
Middle Hindi
Ok so the range for Old Hindi is 10th–13th centuries. But I can't find anything on wiki regarding "Middle Hindi". The template currently links to nothing. Could you self-revert? 178.120.10.176 20:57, 10 June 2024 (UTC)
- I can't seem to find this Middle Hindi thing, even in scholar.google.com. Could you share your sources? 178.120.10.176 21:01, 10 June 2024 (UTC)
- @178.120.10.176: – It's not necessary for everything to be on Wikipedia. Middle Hindi is an etymology-only language enabled on Wiktionary and is used for pinpointing the stages at which Urdu lemmas have been attested (in this case, the 13th–17th centuries). It's not a fully-fledged language. Any lemmas for Middle Hindi are categorised under Old Hindi, like فَرْمان (frman). @Kutchkutch نعم البدل (talk) 01:48, 11 June 2024 (UTC)
- I mean in that case it shouldn't link to Wikipedia. 178.120.10.176 01:54, 11 June 2024 (UTC)
- It just means the article for Middle Hindi doesn't exist on Wikipedia, but that does not necessarily mean the Middle Hindi stage doesn't exist. Middle Hindi is intermediate between Old Hindi and Modern Urdu. نعم البدل (talk) 01:56, 11 June 2024 (UTC)
- Even if Middle Hindi exists, why should it link to an article that doesn't exist? A bit odd, isn't it? 178.120.10.176 02:03, 11 June 2024 (UTC)
@178.120.10.176: Although there may be some flexibility regarding the exact time frame, there are sufficient usages on Google Books and Google Scholar to demonstrate that there exists an entity called “Middle Hindi” outside of Wiktionary (using the quotation marks in the search box). Thus, the term is certainly not unique to Wiktionary. If Wikipedia does not have an article for it, that is not a concern for Wiktionary. What could possibly be a concern for Wiktionary is whether links to Wikipedia should be disabled if there is no corresponding article. If you are interested in starting such a discussion, you could possibly do so at a different location such as WT:GP and see if anyone responds. See
for more such instances. Kutchkutch (talk) 16:48, 12 June 2024 (UTC)
Pronunciations in Urdu
[edit]I see you have been pronouncing words in Urdu and that is amazing. There is one problem with some of the recordings though. شاعر and محبت for example. āiC is not a legal sequence in Urdu outside -āiś from Persian words and later loans from English, āiC always becomes āyaC even in the most cultivated of pronunciations. Also please respect the allophones of vowels u and a around h. In the standard prestigious pronunciations, which you probably also use in everyday life, it is môhôbbat, not muhabbat or mahabbat. I think these should be corrected. You do not have to use artificially make it sound exactly like the transcription, because in practice phonetic phenomena and processes produce allophonic realisations. RonnieSingh (talk) 13:16, 20 June 2024 (UTC)
- Hi @RonnieSingh:, Thanks for your message. I definitely do agree that I haven't done my recordings as professionally as they should have been and it's something that I hate myself for. There are also some audioclips which I've been removing from Wiktionary because they're just so subpar. The main reason for this was:
- 1. I was slightly unwell at that time (and you might hear the hesitancy in my voice as well), and honestly I was recording clips to pass time.
- 2. As you may know I'm a Punjabi-Urdu speaker, so my Urdu is influenced by Punjabi, and I wasn't sure whether to conceal my Punjabi accent or not and whether I should go for the 'formal' Urdu pronunciation.
- 3. Admittedly, the Punjabi vs Urdu issues had me somewhat confused regarding the short vowels in some words.
- If I had a chance, I would honestly remove all of my audio clips and record them all again, but I recorded somewhere around 250 clips, many of which are being used on Wiktionary already, and I don't think it would be possible for me to redo the recordings, especially with my schedule.
Also please respect the allophones of vowels u and a around h
– So for instance when it comes to محبت, would you prefer I pronounced it as [mʊ.ɦəb.bət̪] or [mo.ɦəb.bət̪]? I pronounced it as [mo.ɦəb.bət̪], perhaps it's not coming off as clear as it should be?
You do not have to artificially make it sound exactly like the transcription, because in practice phonetic phenomena and processes produce allophonic realisations
– Thank you, I will bear this in mind for any future recordings I do, and I'll try to correct some of my previous recordings. نعم البدل (talk) 16:43, 20 June 2024 (UTC)
- /mʊɦəbːət/ [mɔɦɔb.bət], like bahut bôhôt RonnieSingh (talk) 17:29, 20 June 2024 (UTC)
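A minimal sketch of the substitution described above, written in Python for illustration. It covers only the two patterns attested in this thread (short u + h + short a, and short a + h + short u, both surfacing as ôhô); it is not a general rule for vowels next to h. The function name `apply_h_allophony` and the use of plain romanisation as input are assumptions, not part of any existing Wiktionary module.

```python
import re


def apply_h_allophony(roman: str) -> str:
    """Rewrite short 'uha' / 'ahu' as 'ôhô' (hypothetical helper)."""
    # Long vowels such as ā and ū are distinct characters, so sequences
    # like 'uhā' are deliberately left untouched.
    return re.sub(r"uha|ahu", "\u00f4h\u00f4", roman)


print(apply_h_allophony("muhabbat"))
print(apply_h_allophony("bahut"))
```

A word with a long vowel next to h is left unchanged by this sketch.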
Contact?
Hey, I saw you were one of the few people who contributed anything to the Urdu lemmas. I'm new to Wiktionary, so it'd be really nice if I could contact you for some formatting/etiquette advice. Syamantak07 (talk) 08:27, 26 July 2024 (UTC)
Some of this looks wrong. Namely the spellings انھوں, تمھیں which imply an aspirated /mʰ/ and /nʰ/ which do not exist in Urdu. It would also make the syllables /tu.mheN/ and /i.nhoN/ which is not how I've ever heard it pronounced.
I've noticed you didn't add it, but it's been making me think that I should edit Module:ur-translit to return nil if ھ is used with a character that can't be aspirated. — BABR・talk 18:30, 5 August 2024 (UTC)
- @Babr: – Thank you for bringing this up. The correct spelling is indeed with the choti-he, not the do-chashme-he. I will fix the respective pages.
but it's been making me think that I should edit Module:ur-translit to return nil if ھ is used with a character that can't be aspirated
– I would probably leave it as it is. While certain compounds might be rare, technically speaking they would be valid, so I don't think it would be wise to limit the do-chashme-he to certain compounds only. نعم البدل (talk) 00:09, 6 August 2024 (UTC)
- @نعم البدل my idea was to just require the transliteration to be manually entered for such cases, since basically all of them would be misspellings AFAIK. But I suppose I'll leave it as-is for the time being. — BABR・talk 07:09, 6 August 2024 (UTC)
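The check proposed above can be sketched as follows. This is Python rather than the Lua that Module:ur-translit is written in, and it is only an illustration: the function name `needs_manual_translit` and the exact set of consonants treated as aspirable are assumptions, not taken from the module.

```python
import unicodedata

DO_CHASHMI_HE = "\u06be"  # ھ

# Assumed set of consonants that can carry aspiration:
# ب پ ت ٹ ج چ د ڈ ڑ ک گ
ASPIRABLE = set(
    "\u0628\u067e\u062a\u0679\u062c\u0686\u062f\u0688\u0691\u06a9\u06af"
)


def needs_manual_translit(word: str) -> bool:
    """Return True if ھ follows a letter outside the assumed aspirable set."""
    previous_letter = None
    for ch in word:
        if ch == DO_CHASHMI_HE:
            if previous_letter not in ASPIRABLE:
                return True
        elif not unicodedata.combining(ch):
            # Skip diacritics (zabar, zer, pesh, ...) between letter and ھ.
            previous_letter = ch
    return False
```

A transliteration module could return nil (or require a manual `tr=`-style override) whenever such a check fires, which is the behaviour being debated here.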
Old Hindi vs Old Punjabi
diff would Baba Farid Ganj works count as "Old Hindi"?
- My understanding is that Baba Farid‘s works are Old Punjabi based on
- (The template T:R:inc-opa:Farid needs to be updated.)
- Based on that understanding it might be better to reanalyse the Old Hindi دل as
- Is there a dictionary/index/glossary that you consulted to find that term, or did you read one of Baba Farid’s works directly at apnaorg.com?
- Although the original works of Baba Farid must be in the Shahmukhi script, they are included in the Guru Granth Sahib in the Gurmukhi script. However, it may not be entirely incorrect to consider Baba Farid’s works as Old Hindi. In some cases it may not be clear which text or term belongs to which language. On Wikipedia, the languages which are difficult to distinguish are collectively known as Sant Bhasha or Sadhukkari.
- Punjabi probably became involved in this mixture of Sant Bhasha and Sadhukkari since Delhi was well-connected to Lahore via the Grand Trunk Road. This may be a contributing factor to how Urdu became the lingua franca of Pakistan rather than Punjabi or Pashto. Faridabad (फ़रीदाबाद), which is in the Delhi National Capital Region, is even named after Baba Farid.
- Languages other than just Hindi-Urdu and Punjabi may have to be taken into consideration to distinguish which text or term belongs to which language. This is because khaṛī-bolī Hindi-Urdu was originally native to Uttar Pradesh and Delhi before becoming a lingua franca of Punjab.
- At Kauravi dialect it says
Khari Boli dialect came to be regarded as urbane and of a higher standard than the other surrounding languages … over the 19th century; before that period, other languages such as Awadhi, Braj Bhasha, and Sadhukaddi were preferred by littérateurs. Standard Hindustani first developed with the migration of Persian Khari Boli speakers from Delhi to the Awadh region—most notably Amir Khusro … Before the rise of Khariboli, the languages adopted by the Bhakti saints: Braj Bhasha (Krishna devotees), Awadhi (adopted by the Rama devotees) … after the Bhakti movement became ritualistic, these languages came to be regarded as rural and unrefined …The Persian-influenced Khariboli thus gradually came to be regarded as a prestige dialect, although hardly any literary works had been written in Khariboli before the British period … European administrators in India and the Christian missionaries played an important role in the creation and promotion of the Khariboli-based modern Hindustani
- At a minimum, the other languages to consider in Sant Bhasha and Sadhukkari would be Braj and Awadhi. Braj and Awadhi are non-khaṛī-bolī Hindi languages that are also native to Uttar Pradesh.
- At Kabir it says
- Kabir's poems were in Sadhukkadi, also known as Panchmel Khichri, borrowing from various dialects including Khadi boli, Braj, Bhojpuri, and Awadhi
- At Awadhi_language it says
- Chaturvedi has shown that the same pada may be found with more characteristic Avadhi forms in the Bijak, with more Khari-boli in the Guru Granth and with Braj forms in the Kabir Granthavali
Khari-boli in the Guru Granth
suggests that Old Punjabi is not the only language in the Guru Granth Sahib. This is why the translation of the quotation at Old Hindi नांती (nāṃtī) is actually the translation of the equivalent verse in Guru Granth Sahib. Some Gurmukhi Old Punjabi dictionaries seem to exclude khaṛī-bolī terms in the Guru Granth Sahib to distinguish the Old Punjabi terms from the terms in other languages. Kutchkutch (talk) 09:41, 18 August 2024 (UTC)
- @Kutchkutch:
Is there a dictionary/index/glossary that you consulted to find that term, or did you read one of Baba Farid’s works directly at apnaorg.com?
– So the reason why I mentioned this is because UDB actually used a quote from Baba Fareed Ganj Shakar and dated it around 1265. While I, too, am of the opinion that his texts should mainly be used for Old Punjabi, I wanted a second opinion on this.
- The book they've quoted is called urdū kī ibtidāī naśo o namā mẽ sūfiyāe karām kā kām, i.e. The role of Sufi [elders] in the early growth and development of Urdu.
However, it may not be entirely incorrect to consider Baba Farid’s works as Old Hindi
– as I said, I agree with this. Ignoring BFGS works, the earliest attestation would then be 1421 (for the word دِل (dil)).
- نعم البدل (talk) 19:20, 18 August 2024 (UTC)
I wanted a second opinion on this.
- I appreciate that you pinged me for another opinion. And, thanks for letting me know that this term from Baba Farid’s work is being treated as Old Hindi.
The book they've quoted is called …
- I initially did look at {{R:ur:UDB}}, but I did not notice it was there until you replied. Now that you have mentioned it, I can see where it is on the page. Thanks for the translation and transliteration of the book that has been quoted.
as I said, I agree with this.
- Sorry if the Wikipedia-like content after the first few paragraphs was/is a bit distracting. Since {{R:ur:UDB}} has mentioned the Baba Farid quotation, the term can perhaps remain as Old Hindi for now. Kutchkutch (talk) 14:53, 20 August 2024 (UTC)
Punjabi pronunciation
- Discussion moved from Talk:ਅੰਨ੍ਹਾ#Transliteration.
I had been adding the 'Phonetic' sections under the Pronunciation header specifically for Punjabi lemmas.
- Yes, I have noticed some entries have this while other entries do not. This inconsistency makes it unclear whether they are acceptable or not. The non-standard aspect about them is that they are untemplatised with the syntax being
{{w|Punjabi_language#Phonology|Phonetic}}
It was basically there for those who didn't understand IPA, and unified both scripts.
- These phonetic transcriptions occur very frequently in Gurmukhi Punjabi learning materials. I do find them useful even with an intermediate understanding of the IPA.
- I have never thought of them as a replacement for the IPA. Instead, they add the tonal information to the orthographic transliteration, which is just another way of saying ISO 15919 + tones.
- Since phonemic tone does not exist in Indo-Aryan languages outside the greater Punjab and Bengal regions, this intermediary between the transliteration and IPA could possibly help readers better understand terms with phonemic tones.
- They are like transcriptions used in the |ts= parameter. Other tonal languages such as Thai do show alternative romanisation schemes in entries. Therefore, I am not entirely opposed to showing them on Punjabi entries if there is a way to standardise them perhaps with Module:pa-IPA.
would anyone be kind enough in helping develop Module:pa-IPA
- AryamanA was interested in developing Module:mr-IPA several years ago. I helped him with the underlying phonology and with adding testcases. He was able to develop it to the point of being able to deploy it on entries.
- Although I am not as competent at coding as I would like, I would certainly like to explore if I can help with the development of this at some point. Kutchkutch (talk) 03:05, 26 August 2024 (UTC)
- @Kutchkutch:
This inconsistency makes it unclear whether they are acceptable or not.
– The reason for this was, I only added them for lemmas which had a tonal pitch. For the others, I didn't see a need, since the transliteration sufficed. The issue is, as you may know, the Punjabi scripts don't represent the phonology / pronunciation of the words too well. Neither Shahmukhi nor Gurmukhi script explicitly indicate tones, and often there are multiple ways of writing a word, though the pronunciation is the same, or there are inconsistencies in the Gurmukhi and Shahmukhi spellings, and that can sometimes cause a confusion in the pronunciation.
I have never thought of them as a replacement for the IPA
– Of course.
Other tonal languages such as Thai do show alternative romanisation schemes in entries. Therefore, I am not entirely opposed to showing them on Punjabi entries if there is a way to standardise them perhaps with Module:pa-IPA.
– Yes, the Thai and Chinese romanisation schemes did influence me to start something for Punjab lemmas. I would love to work on a Module:IPA-pa, if you have time. Perhaps it's something @عُثمان may also want to get involved with. نعم البدل (talk) 07:44, 26 August 2024 (UTC)
- @Kutchkutch @نعم البدل It would be challenging, but I would be interested in helping with an IPA generator. There is some outstanding information regarding syllabification I would need to work out first عُثمان (talk) 12:01, 26 August 2024 (UTC)
I only added them for lemmas which had a tonal pitch. For the others, I didn't see a need, since the transliteration sufficed.
- I have observed that not all terms have tonal pitch, so only the orthographic transliteration would suffice in such cases. Thanks for confirming this observation.
Neither Shahmukhi nor Gurmukhi script explicitly indicate tones, and often there are multiple ways of writing a word, though the pronunciation is the same
- This is what I have for the origin of tones:
- In the beginning there were probably no tones … Slowly … tones have evolved. With the evolution of tones sometimes the structure of the writing system remains the same but the correspondence of what is written and how it is to be pronounced keeps on changing.
- This is what I have for ਘ ਝ ਢ ਧ ਭ
- ਘ ਝ ਢ ਧ ਭ when appearing at the beginning of a syllable, carry the low tone
- ਧ: ਧੋਬੀ tòbī
- ਭ: ਭਾਈ / بھائی pā̀ī
- ਝ: ਝਾੜੂ / جھاڑُو cā̀ṛū
- ਘ ਝ ਢ ਧ ਭ when appearing at the end of a word, carry the high tone
- ਧ: ਕੰਧ / کَنْدھ kánd wall
- ਧ: ਦੁੱਧ / دُدّھ dúd milk
- This is what I have for ਹ
- ਹ does not involve any tonal use in the initial position in a syllable. ਹ marks high tone when not word-initial
- ਚਾਹੀਦਾ / چاہِیدا cā́īdā
- ਵਾਰ / وار vār turn vs ਵਾਹਰ / واہَر vā́r crowd
- ਪੀ / پِی pī drink vs ਪੀਹ pī́ grind
- ਲੋ / لو lo light vs ਲੋਹ ló griddle
- ਮੋਰ / مور mor peacock vs ਮੋਹਰ mór seal
- ਹ in the word-final position is replaced by a high tone with no phonetic transformation of vowel sounds.
- ਚਾਹ / چاہ cā́
- ਇ (i) + ਹ (h) represents é
- ਕਿਰਾ vs ਕਿਹੜਾ / کِہڑا {kihṛā} kéṛā
- ਉ (u) + ਹ (h) represents ó
- ਕੁਰ vs ਕੁਹੜਾ / کُہڑا {kuhṛā} kóṛā
- ਹ (h) + ਇ (i) represents ɛ́
- ਕਹਿ {kahi} kɛ́
- ਹ (h) + ਉ (u) represents ɔ́
- ਕਹੁ {kahu} kɔ́
- For Shahmukhi,
- In Panjabi, ہ choṭī he and ح baṛī he can also indicate high or low tone, depending on their position in a syllable, as in کھیہ khé ‘dust’, or آرام دہ ārām dé ‘restful’, where ہ choṭī he indicates high tone, or in اصلاح islā́ ‘reformation, correction’ where ح can indicate high tone for some speakers.
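The rules outlined above for ਘ ਝ ਢ ਧ ਭ and for word-final ਹ can be sketched as a small Python function working on the orthographic transliteration. This is only an illustration under simplifying assumptions: it looks at the word-initial and word-final positions only (not every syllable boundary), it does not handle the ਇ/ਉ + ਹ vowel changes, gemination, or nasalised vowels, and the name `add_tone` is hypothetical rather than anything in Module:pa-IPA.

```python
import unicodedata

LOW, HIGH = "\u0300", "\u0301"  # combining grave / combining acute

# Orthographic voiced aspirate -> (voiceless plain stop, voiced plain stop)
ASPIRATES = {
    "gh": ("k", "g"),
    "jh": ("c", "j"),
    "\u1e0dh": ("\u1e6d", "\u1e0d"),  # ḍh -> ṭ (initial) / ḍ (final)
    "dh": ("t", "d"),
    "bh": ("p", "b"),
}
VOWELS = set(unicodedata.normalize("NFC", "aāiīuūeoɛɔ"))


def nfc(text: str) -> str:
    return unicodedata.normalize("NFC", text)


def add_tone(translit: str) -> str:
    """Add low/high tone marks to a transliterated word (hypothetical helper)."""
    word = nfc(translit)
    for aspirate, (voiceless, voiced) in ASPIRATES.items():
        if word.startswith(aspirate):
            # Initial aspirate: voiceless stop, low tone on the next vowel.
            rest = word[len(aspirate):]
            for i, ch in enumerate(rest):
                if ch in VOWELS:
                    return nfc(voiceless + rest[: i + 1] + LOW + rest[i + 1:])
        if word.endswith(aspirate):
            # Final aspirate: voiced stop, high tone on the preceding vowel.
            stem = word[: -len(aspirate)]
            for i in range(len(stem) - 1, -1, -1):
                if stem[i] in VOWELS:
                    return nfc(stem[: i + 1] + HIGH + stem[i + 1:] + voiced)
    if len(word) > 1 and word.endswith("h") and word[-2] in VOWELS:
        # Word-final h after a vowel is replaced by a high tone.
        return nfc(word[:-1] + HIGH)
    return word
```

With the examples from the outline, `add_tone("dhobī")` yields tòbī, `add_tone("kandh")` yields kánd, and `add_tone("cāh")` yields cā́. Cases such as ਦੁੱਧ, where the transcription above also drops the geminate, are outside what this sketch models.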
the Thai and Chinese romanisation schemes did influence me to start something for Punjab lemmas
- I have also noticed Chinese, but I initially didn’t mention it because Wiktionary’s treatment of it is very different compared to other languages. The other aspect of Chinese that could be applicable to Punjabi is {{dialect synonyms}}.
- @عُثمان
There is some outstanding information regarding syllabification
- The syllabication of Marathi doesn’t always work well even in its current state, especially with respect to ळ. So, the output just needs to be usable even if it isn’t perfect. Kutchkutch (talk) 13:35, 26 August 2024 (UTC)
- @Kutchkutch This is all accurate, although there are some additional details which may be noted:
- There are a small number of words in which voiced stops are retained alongside the tone. These are spelled using a subscript ਹ, for example ਜਗ੍ਹਾ rather than ਜਘਾ and ਜਬ੍ਹਾ rather than ਜਭਾ. Likewise there are consonants without voiced counterparts which occur in small numbers of words with subscript ਹ: ਸ੍ਹ ਣ੍ਹ ਨ੍ਹ ਮ੍ਹ ਯ੍ਹ ਰ੍ਹ ਲ੍ਹ ਵ੍ਹ ੜ੍ਹ ਲ਼੍ਹ. The tone rules are the same as for ਘ ਝ ਢ ਧ ਭ.
- Tonal ਹ does occur word initially in Pothohari and related dialects. ਹਲਦੀ is pronounced àḷdī in that dialect for example.
- The pronunciation of intervocal ਹ has some lexical variation. In ਆਹੋ it is always a consonant with no tone. In ਮਹੀਨਾ it is tonal in most dialects but pronounced with h in (Bist) Doabi.
- عُثمان (talk) 14:00, 26 August 2024 (UTC)
- @Kutchkutch:
This is all accurate
- Thanks for the corroboration.
There are some additional details which may be noted
- The outline that I provided may not be complete, so thanks for the additional details. The details are what help create testcases, and the testcases set an expectation for the output of a module.
There are a small number of words in which voiced stops are retained alongside the tone. These are spelled using a subscript ਹ … Likewise there are consonants without voiced counterparts which occur in small numbers of words with subscript
- This would explain the term ਅੰਨ੍ਹਾ, which was how this discussion started in the first place.
Tonal ਹ does occur word initially in Pothohari and related dialects … ਮਹੀਨਾ it is tonal in most dialects but pronounced with h in (Bist) Doabi
- This information about other dialects is definitely very helpful and interesting. Whether Module:pa-IPA needs to account for it depends on whether its scope extends beyond the standard Majhi Punjabi. Kutchkutch (talk) 15:50, 26 August 2024 (UTC)
- @Kutchkutch
This information about other dialects is definitely very helpful and interesting. Whether Module:pa-IPA needs to account for it depends on whether its scope extends beyond the standard Majhi Punjabi.
– My humble opinion is that we ignore other dialects for now and just aim to get something going for Standard Punjabi. Additionally, instead of interpreting the IPA from the native scripts, IMO it's better to require a parameter for the module, similar to the phonetic transcription I was adding previously. نعم البدل (talk) 16:03, 26 August 2024 (UTC)
instead of interpreting the IPA from the native scripts, IMO it's better to require a parameter for the module, similar to the phonetic transcription
- Definitely. This is how most Indo-Aryan IPA modules such as MOD:hi-IPA, MOD:mr-IPA, etc. work. Although the native script can be used as an input to these modules, this is just an additional feature. The algorithms of these IPA modules still convert the romanisation to IPA rather than convert the native script to IPA. Since the Dravidian IPA pronunciation modules convert the native script to IPA, consulting them as a model may not be helpful.
- I included the native scripts just to provide some context for the examples. Similar to بُحْران, the Shahmukhi spelling of ਮੋਹਰ seems to represent /o/ as اُ rather than as و. This is perhaps an example of why the native scripts should not be used as an input.
- A pronunciation guide for Punjabi even says:
- Writing systems are very deceptive. Native speakers don’t read every word like a foreign language because they already know the words, so when they do read they don’t read letter by letter.
- It is the spoken part of Punjabi that is more important than the written part.
- The alphabets whether they are English or Punjabi are only signs. They just help you to go further. Generally they are quite an impediment. Kutchkutch (talk) 17:36, 26 August 2024 (UTC)
- @Kutchkutch:
This is how most Indo-Aryan IPA modules such as MOD:hi-IPA, MOD:mr-IPA, etc. work
– Yeah, I've seen the modules. The difference is Hindi and Marathi phonology are quite easy to interpret unlike Punjabi scripts and it's quite convenient that Module:hi-IPA is based on Hindi transliteration.
This is perhaps an example of why the native scripts should not be used as an input.
– It would eventually be supported but I feel that for something similar to work for Punjabi, you'd need a separate module for the IPA, and a separate module for transliteration to IPA. The latter can come at a later point.
- Now there is also the issue of Phonemic vs Phonetic transcriptions. My personal preference would be that it should be able to display both, but since we can't generate the IPA transcriptions based on the scripts, it would only be the phonetic transcriptions that would be generated (ie. bā́jō̃ -> [bäː˦.d͡ʒõː]), plus the issue of half gemination (I'm not sure if you've experienced this in Punjabi @عُثمان). نعم البدل (talk) 17:51, 26 August 2024 (UTC)
since we can't generate the IPA transcriptions based on the scripts, it would only be the [phonemic? //] transcriptions that would be generated
- For converting phonemic transcriptions // to phonetic transcriptions [], there may either be
- a need for a separate module
- or an additional functionality built-in to the existing module
- Perhaps this is why phonetic transcriptions [] should be handled at a later point.
- Although Module:mr-IPA accounts for some phonetic processes, this is another aspect of that module that has errors (in addition to syllabication of ळ).
plus the issue of half gemination (I'm not sure if you've experienced this in Punjabi
- I’m guessing that this refers to the following phonetic rule for unetymological gemination.
- ∅ → ˑ / (ω)Vː.C_ρ
- The statement at Punjabi language says
- There is a tendency to irregularly geminate consonants which follow long vowels, except in the final syllable of a word such as ਮੈਨੂੰ / مَینُوں
- ∅ → ˑ / ω1ɛː.n_ρ2
- This would also apply to the word ਪੰਜਾਬੀ / پَنْجابِی as
- ∅ → ˑ / σ1.ω2äː.b_ρ3
- and perhaps to ਰੋਟੀ / روٹِی as
- ∅ → ˑ / ω1oː.ʈ_ρ2
- Kutchkutch (talk) 04:02, 27 August 2024 (UTC)
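The rule ∅ → ˑ / Vː.C_ quoted above can be sketched in Python as a post-processing step on an already syllabified IPA string. This is an illustration under stated assumptions: syllables are separated by ".", the list of vowel symbols is a small assumed subset, only a single-letter onset is handled (an affricate written with a tie bar would need extra care), and the name `half_geminate` is hypothetical.

```python
import re

HALF_LONG = "\u02d1"  # ˑ

# A length mark (ː), a syllable break, then one letter that is not one of
# the assumed vowel symbols: a e i o u ɛ ɔ ə ɪ ʊ ä
_LONG_VOWEL_THEN_ONSET = re.compile(
    r"(\u02d0\.)([^\Waeiou\u025b\u0254\u0259\u026a\u028a\u00e4])"
)


def half_geminate(ipa: str) -> str:
    """Insert ˑ after a consonant that follows a long vowel across a syllable break."""
    return _LONG_VOWEL_THEN_ONSET.sub(
        lambda m: m.group(1) + m.group(2) + HALF_LONG, ipa
    )


print(half_geminate("roː.ʈiː"))
```

Because the pattern needs a consonant after the syllable break, a long vowel in the final syllable never triggers the insertion, which matches the exception stated in the quoted rule.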
پھرنا
- Usually, I just make minor edits or add links to Shahmukhi Punjabi terms (or Punjabi terms in general) and don’t make edits like this. However, editing/creating (learning about) Shahmukhi Punjabi entries is something I am interested in. So, thanks for the feedback.
- Regarding {{R:pa:CLE}},
- I wanted to use {{R:pa:CLE}}, but currently this just links to the main page.
- I found out that individual entries have IDs in the URL format
202.142.159.36:8081/opd/WordFinder.aspx?wordid=31550
- Regarding
{{pnb-conj-v}}
,
- Since the verb stem of پھرنا ends in ر, wouldn’t ݨ be replaced by ن? My understanding is that the infinitive marker for verb stems ending in [ر ڑ ڑھ ݨ لؕ=ࣇ] is نا rather than ݨا.
- If so, then some additional template code may need to be added to {{pnb-conj-v}} and its helper templates {{pnb-conj-imps}} and {{pnb-conj-no-aspect}} to reflect this.
Kutchkutch (talk) 09:16, 6 September 2024 (UTC)
- @Kutchkutch:
individual entries have IDs in the URL format
– I tried adding the individual ID's as a parameter in the {{R:pa:CLE}} template, but I found that the individual ID's aren't permanent. They're temporary, and eventually when they expire, the user is returned with an error, unfortunately. However, CLE's dictionary is pretty much (if not the exact) same as Salah-ud-Din Iqbal's dictionary. The definitions of words that I've seen so far, were the exact same.
Regarding {{pnb-conj-v}}
– This is quite a basic template and certainly needs to be worked on, so feel free to even completely rewrite it. I only use it on basic verbs. Ideally, either Module:pnb-verb needs to be made (or converted), or code to support verbs in Shahmukhi needs to be added to Module:pa-verb.
Since the verb stem of پھرنا ends in ر, ... My understanding is that the infinitive marker for verb stems ending in approximate/nasal laterals
– As a general rule of thumb, yes, that is the case – I'm also assuming that's considered "Standard", though it's not always the case. I've heard پِھرݨا (phirṇā) as well (although I would keep it as پِھرنا (phirnā)). نعم البدل (talk) 14:48, 6 September 2024 (UTC)
I found that the individual ID's aren't permanent
- I have observed that the pages do time out, but I didn’t realise that the individual IDs are temporary as well. In that case, if the reference template is to be used, just having a link to the main page is the best that can be done. {{R:sd:CLE}} doesn't seem to have this issue, which is why I must have assumed that the individual IDs for {{R:pa:CLE}} would stay permanent.
CLE's dictionary is pretty much (if not the exact) same as Salah-ud-Din Iqbal's dictionary.
- Yes, I noticed that. However, even if the information across different reference templates is the same, the different interface may still be useful. For example, {{R:sd:CLE}} is almost the same as {{R:sd:Mewaram}}.
- In addition to its own information, {{R:ur:Rekhta}} is a consolidation of various other dictionaries.
Regarding {{pnb-conj-v}} – This is quite a basic template and certainly needs to be worked on, so feel free to even completely rewrite it.
- Thanks for the openness. However, I’d be hesitant to make any major changes, especially since I don’t consider myself a proficient coder. And, I wouldn’t want to break the existing functionality for the sake of testing something new.
- Having said that, regarding
the infinitive marker for verb stems ending in approximate/nasal laterals
- Perhaps the string functions in MOD:string can be invoked to replace ݨ with ن after [ر ڑ ڑھ ݨ لؕ=ࣇ] without having to use a module such as MOD:pnb-verb. For example, the usage for the substring function is
- {{#invoke:string|sub|target_string|start_index|end_index}}
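The suffix choice being discussed can be sketched in Python to make the intended behaviour concrete before deciding how to express it with MOD:string or a dedicated module. This is an illustration only: the function name `infinitive` is hypothetical, the list of stem-final letters is taken from the bracketed list above, and dialectal forms with ݨ after these letters are not modelled.

```python
import unicodedata

NOON, RNOON, ALIF = "\u0646", "\u0768", "\u0627"  # ن, ݨ, ا

# Stem endings after which the infinitive marker is نا rather than ݨا:
# ر, ڑ, ڑھ (ڑ + ھ), ݨ and ࣇ
PLAIN_NOON_FINALS = ("\u0631", "\u0691", "\u0691\u06be", "\u0768", "\u08c7")


def infinitive(stem: str) -> str:
    """Attach نا or ݨا to a Shahmukhi verb stem (hypothetical helper)."""
    # Ignore diacritics when inspecting the end of the stem.
    bare = "".join(ch for ch in stem if not unicodedata.combining(ch))
    suffix = NOON if bare.endswith(PLAIN_NOON_FINALS) else RNOON
    return stem + suffix + ALIF
```

For the stem پِھر this gives پِھرنا, while a stem such as لِکھ gives لِکھݨا. The same condition would have to be repeated wherever the conjugation templates build forms from the infinitive marker.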
- Regarding the naming of templates and modules,
- Since pnb has been removed as a language code for Shahmukhi Punjabi, would instances of pnb be replaced with pa-Arab? Or would that not be justified without simultaneously replacing Gurmukhi-only instances of pa with pa-Guru?
- Kutchkutch (talk) 06:38, 8 September 2024 (UTC)
- Regarding {{R:pa:Bashir}},
- Thanks for making the template. I have noticed that dictionary at DDSA before. As can be seen at the entry for
- گھر 【گَھر】 ghar {kàr}
- Its formatting is quite impressive, especially since
- It has the spelling with and without the diacritics zabar, zer, pesh, tashdid, etc.
- It has orthographic transliteration plus the tonal phonetic transliteration for applicable terms.
- The front matter is very informative in terms of grammar.
- The example sentences are helpful. Perhaps they can be used in entries with a citation.
- However, I didn’t consider making a template for it because it doesn’t seem comprehensive. Many of the words I have been making descendants trees for don’t seem to have entries in it. It says in the front matter:
- The present work is the first edition of the continuing series of the Punjabi-English Dictionary.
- After looking at this dictionary, the same publisher’s Pashto dictionary {{R:ps:Pashtoon}} seems to be equally impressive for finding borrowings from/into Pashto.
- Coincidentally, one of the authors’ given names, Kanval کَن٘وَل, is in the descendants tree that I recently expanded at 𑆑𑆩𑆬 (kamala). However, a common issue that I’ve faced is that even though the descendants tree for that term has been expanded, deciding how to make the entry for کَن٘وَل or قَول is another challenge.
- Regarding CAT:Punjabic languages,
- I recently renamed that from CAT:Punjabi-Lahnda languages based on
- Existing pieces of 'Lahnda' should be integrated with Punjabi at User_talk:Kutchkutch#Saraiki_–_Old_Punjabi
- and a recent discussion with User:Svartava on the Wiktionary South Asia Discord server (that you’re welcome to join as User:Svartava said at Special:Diff/81063884).
- Kutchkutch (talk) 03:49, 11 September 2024 (UTC)
- @Kutchkutch:
I didn’t consider making a template for it because it doesn’t seem comprehensive
– I would definitely agree with you here, and while I appreciate the phonetic transliteration, the method of transliteration seems quite informal. Nevertheless, I do like it because, though it's not as comprehensive as Iqbal's VPL, it does contain a lot of information for each word, and can be a good add for the 'Further reading' section. It also categorically mentions Saraiki and Pothwari terms, and I've not seen any proper digital dictionaries for those vernaculars, and Iqbal fails to highlight dialectal usage in his dictionary.
The example sentences are helpful. Perhaps they can be used in entries with a citation.
– Personally, I wouldn't add them as quotes on WT, unless the example sentences were taken from the Punjabi newspapers, and can be retrieved online or verified. I like to add quotes to show the word or sense in actual usage.
I recently renamed that from CAT:Punjabi-Lahnda languages based on
– Thank you, that should suffice. نعم البدل (talk) 04:14, 11 September 2024 (UTC)
- This craze about discord, how can it be that I get left behind :) نعم البدل (talk) 04:26, 11 September 2024 (UTC)
- Regarding