Commons:Village pump/Technical
This page is used for technical questions relating to the tools, gadgets, or other technical issues about Commons; it is distinguished from the main Village pump, which handles community-wide discussion of all kinds. The page may also be used to advertise significant discussions taking place elsewhere, such as on the talk page of a Commons policy. Recent sections with no replies for 30 days and sections tagged with {{Section resolved|1=--~~~~}} may be archived; for old discussions, see the archives; recent archives: /Archive/2024/08 /Archive/2024/09.
- Feature or bug reports should be filed on Phabricator (see how to report a bug). Bugs with security implications should be reported differently (see how to report security bugs).
- Have you read the FAQ?
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 1 day and sections whose most recent comment is older than 30 days.
OCR to auto-categorize maps / charts by year shown
Is there any gadget/tool for optical character recognition (OCR) of files on Wikimedia Commons?
If there is no such thing it would be really great if somebody could give it a try, it could be very useful.
I'd like to categorize Our World in Data maps by the year of the data into Category:Maps of the world by year as well as OWID charts by the latest data point into Category:Charts by year of latest data.
This is useful for many reasons such as making things in the image explicit as metadata, making things queryable (for example combining cats using petscan), statistics, search (see the search box), better enabling people to find the latest version for some data, better WMC search engine results, and (probably most importantly) updating outdated/old datagraphics that are in use (GLAMorgan can be used for that).
The issue there is that there are a great many OWID files (which should now all be in the OWID category), and there may be far more once people upload "image stacks" for the OWID Gadget, if that is the way used to display more interactive OWID data (which I oppose as suboptimal).
- Here is the petscan query for OWID maps with unspecified year (552 items)
- Here is the petscan query for OWID charts with unspecified year of latest data (2704 items)
One could go through the former manually, which also has the advantage that many of these are missing one or a few other categories. The second one, however, has too many items to do that manually. More OWID datagraphics also keep getting uploaded, and this isn't only about OWID datagraphics (there are other cats one could scan too).
See also my related comment here that is about machine vision on WMC more generally or automated species identification: …open letter…#Image recognition software for categorisers.
In my example use case, an OCR Commons tool could, for example, read all numbers in a file (the files in the PetScan results) and then (if it found one or a plausible one) set the category for the latest year that is ≤ current year. Prototyperspective (talk) 11:43, 19 July 2024 (UTC)
- For Category:Images by text that could be helpful too. Ideally one could choose
- a word, group of words, or category tree
- define a maximum number of words or characters that should be on an image (sample: less than 5 words). This to avoid doing OCR on lengthy texts.
- Then confirm suggestions made by OCR. Enhancing999 (talk) 12:21, 19 July 2024 (UTC)
- I do not know about gadgets.
- There is an OCR tool.
- See https://ocr.wmcloud.org/ for direct interface and API documentation.
- It will work with PNG files but not SVG files (which can be converted to PNG and then OCR'd).
- One can get the URL for a PNG rendering of an SVG file. Here's a conversion that is 887 pixels wide
- {{filepath:Tulejki zaciskowe.svg|887}} →
- https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Tulejki_zaciskowe.svg/887px-Tulejki_zaciskowe.svg.png
- Here's a Polish OCR run on that PNG:
- [https://tools.wmflabs.org/ws-google-ocr/api.php?image={{filepath:Tulejki zaciskowe.svg|887}}&lang=pl Tulejki zaciskowe.svg] →
- Tulejki zaciskowe.svg →
- https://ocr.wmcloud.org/api.php?engine=google&image=https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/Tulejki_zaciskowe.svg/887px-Tulejki_zaciskowe.svg.png&lang=pl →
{"engine":"google","langs":["pl"],"psm":3,"crop":[],"image_hosts":["upload.wikimedia.org","upload.wikimedia.beta.wmflabs.org"],"text":"Typ \u015bci\u0105gaj\u0105cy\nTyp naciskaj\u0105cy\nTyp obustronny"}
- So the Polish text is (converting Unicode code points to Unicode)
- Typ ściągający
- Typ naciskający
- Typ obustronny
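The request above can also be assembled programmatically. What follows is a minimal Python sketch, not something taken from the tool's documentation: the function names are made up for illustration, the engine, image and lang parameters are simply copied from the request shown above, and the thumbnail path assumes the usual MediaWiki hashed layout (first one and first two hex digits of the MD5 of the underscored file name). Network access is left out on purpose; ocr_lines only decodes a response body like the one quoted above.

```python
import hashlib
import json
from urllib.parse import quote, urlencode

UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons"
OCR_API = "https://ocr.wmcloud.org/api.php"


def svg_thumb_url(file_name, width):
    """Build the PNG thumbnail URL for an SVG file name (without 'File:').

    Assumes the name is already normalized (first letter uppercase).
    """
    name = file_name.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    encoded = quote(name)
    return (f"{UPLOAD_BASE}/thumb/{digest[0]}/{digest[:2]}/"
            f"{encoded}/{width}px-{encoded}.png")


def ocr_request_url(image_url, lang, engine="google"):
    """Build the OCR API request URL with the parameters used above."""
    # safe=":/" leaves the image URL unescaped, as in the quoted request.
    query = urlencode({"engine": engine, "image": image_url, "lang": lang},
                      safe=":/")
    return f"{OCR_API}?{query}"


def ocr_lines(response_body):
    """Decode the JSON response and return the recognized text line by line."""
    return json.loads(response_body)["text"].split("\n")
```

Decoding with json.loads also takes care of the \u escapes, so no manual conversion of code points is needed.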
- But why OCR an SVG file? The PetScan query shows SVG files that have text elements.
- With JavaScript, read the SVG file with the Fetch API, grab the text elements with getElementsByTagNameNS(nsSVG, "text"), ask for the .textContent of each text element, and then search that string for the years or terms you want.
- I do not know about the rest of the task.
- Glrx (talk) 14:57, 19 July 2024 (UTC)
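The text-element approach described above uses JavaScript in the browser. As a rough equivalent outside the browser, here is a Python sketch using the standard library's ElementTree. The helper names svg_text_strings and mentions are hypothetical, fetching the file is left out, and the sample SVG is invented purely for illustration.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Invented sample; a real file would be downloaded from Commons first.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="0" y="10">Death rate from smoking, <tspan>1996</tspan></text>
  <text x="0" y="20">Data source: Example (2019)</text>
</svg>"""


def svg_text_strings(svg_source):
    """Return the text of every SVG <text> element, in document order."""
    root = ET.fromstring(svg_source)
    strings = []
    for element in root.iter(f"{{{SVG_NS}}}text"):
        # itertext() also collects text inside nested <tspan> children,
        # similar to what .textContent gives in the browser.
        content = " ".join("".join(element.itertext()).split())
        if content:
            strings.append(content)
    return strings


def mentions(strings, pattern):
    """Keep only the strings that match the given regular expression."""
    regex = re.compile(pattern)
    return [s for s in strings if regex.search(s)]
```

With the sample above, mentions(svg_text_strings(SAMPLE_SVG), r"1996") keeps only the title string.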
- Wow great so around 70% of this already exists! Thanks a lot for this info. Now it basically only needs a way to make it scan files in petscan results.
- SVG files always have a PNG file linked beneath them so they don't need to be converted again.
- However, SVG files already have the text as plain text in them, so rather than OCRing them it would be better if the text contained in them was read somehow. However, that (which you also described in your bottom paragraph) is not needed here:
- I tested it like so with a PNG render underneath File:Death-rate-smoking,1996.svg and it worked very well.
- If there was a tool where one can e.g. enter a petscan ID and it makes these requests the other thing needed would be
- the small code that checks for the latest plausible year-number (and either in the first few lines / title or not in the same line as Data source)
- a bot that adds the categories to the files accordingly.
- Is there a developer here who is interested in building these three missing parts assuming they don't also exist already? Prototyperspective (talk) 15:37, 19 July 2024 (UTC)
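The year check asked for above could look roughly like the following Python sketch. It is only one possible reading of the requirement: the function name is hypothetical, years are assumed to lie between 1000 and 2099, and of the two options mentioned it implements the "not in the same line as Data source" rule rather than the "first few lines" rule.

```python
import re

# Four-digit numbers from 1000 to 2099 that are not part of a longer number.
YEAR_PATTERN = re.compile(r"(?<!\d)(1[0-9]{3}|20[0-9]{2})(?!\d)")


def latest_plausible_year(ocr_text, current_year, skip_marker="Data source"):
    """Return the latest year <= current_year found in the text, or None.

    Lines containing skip_marker (case-insensitive) are ignored, because
    they usually carry the publication year of the source, not the data year.
    """
    candidates = []
    for line in ocr_text.splitlines():
        if skip_marker.lower() in line.lower():
            continue
        for match in YEAR_PATTERN.finditer(line):
            year = int(match.group(1))
            if year <= current_year:
                candidates.append(year)
    return max(candidates) if candidates else None
```

A result of None would mean the file is left for manual review instead of being categorized automatically.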
- https://ocr.wmcloud.org/ interesting tool. Quite surprising what OCR on photos actually gives. I tried:
- Both found "rue des lauriers", but the first also a motto and the second part of sticker from a key service on the pole ;)
- Maybe OCR could be added automatically on upload and stored somehow to be searchable. Possibly, as structured data so it's editable. Enhancing999 (talk) 10:49, 22 July 2024 (UTC)
- About SVG: ideally the text would be rendered on the file description page separately. Maybe that's something that can be added through LUA directly on Template:Information Enhancing999 (talk) 17:46, 22 July 2024 (UTC)
- I added a request for that at Template_talk:Information#Output_SVG_text. Enhancing999 (talk) 10:18, 29 July 2024 (UTC)
- Prototyperspective, you stated "I'd like to categorize Our World in Data maps by the year of the data into Category:Maps of the world by year". I think that is a great idea for the digital maps of the 21st century and I have done this a lot (manually) for hundreds of OWiD maps. However, I'd like to prevent you from going overboard once you finished with the OWiD maps: Please do not categorize old maps by year, including old maps of the world. Reprints, republications, entry errors and the natural delay between surveying and publishing the final maps, means that almost all older maps (before ~1990s) should preferably get organized/categorized by decade as the finest granularity. All my best wishes for the OWiD project, --Enyavar (talk) 12:28, 7 August 2024 (UTC)
The XML in the uploaded file could not be parsed
Hello! I wanted to create a map. I got a free base layer in PNG, opened Inkscape and imported the PNG file. After that I added several lines and symbols and saved the result as SVG. When I try to upload the result to Commons, I see "The XML in the uploaded file could not be parsed". One hypothesis is that the problem is in the embedded PNG layer, but, as I remember, there are SVG files on Commons which contain raster layers. The size of the file is 12 MB. Microsoft Edge opens the file normally. What causes the upload error? It is possible to download the file to check it. Perhaps there is some web service which can repair the structure of the document if it is broken? But, indeed, I'm not sure that the file is broken: it is simple (a raster layer, a few lines and symbols) and is not huge. Dinamik (talk) 09:44, 27 July 2024 (UTC)
- We do not allow uploads of SVGs with images inside of them. It is often misused and it creates potential security problems because our file scanners do not work on those embedded images. —TheDJ (talk • contribs) 08:07, 28 July 2024 (UTC)
- Did such limitation exist in Commons always? I believe, that, for example, first versions of this file have embedded baselayer. Dinamik (talk) 09:56, 28 July 2024 (UTC)
- Probably not, see Category:Fake SVG. Enhancing999 (talk) 10:16, 28 July 2024 (UTC)
- Commons has always allowed files to have embedded bitmaps, but those bitmaps must use the data: scheme. Files with external URLs are now blocked from uploading. Furthermore, the Commons rasterizer will not fetch external URLs, so such a base layer would no longer display. All the versions of the St. Petersburg map display, so there would not be an external URL. Glrx (talk) 22:57, 28 July 2024 (UTC)
- The file is over 10 MB. At one point, SVG uploads were limited to 10 MB, but I do not believe that is still the case.
- The file is mostly an embedded PNG. Following that, there are some path and flowRoot elements. The path elements should be OK, but the flowRoot is not supported. It was described in an SVG 1.2 draft, but that draft was not accepted. The element does not exist in the SVG 2.0 spec.
- WMF supports SVG 1.1. Even if you could upload the file, it would not display as you would expect.
- I do not see a reason for the XML error. W3's validator finds 67 errors, but they only involve normal Inkscape, sodipodi, and RDF extensions or the bogus flowRoot elements.
- Glrx (talk) 23:15, 28 July 2024 (UTC)
- Running rsvg-convert (latest version, 2.58) on that SVG gives an error without the --unlimited option, which is described as "The XML parser has some guards designed to mitigate large CPU or memory consumption in the face of malicious documents. It may also refuse to resolve data: URIs used to embed image data in SVG documents." Dexxor (talk) 07:17, 29 July 2024 (UTC)
- Yes, I think the most likely answer is that MediaWiki is not setting LIBXML_PARSEHUGE, which limits the max size of text nodes and attributes to 10 (decimal) megabytes. As embedded images are stored as base64 data: URLs, this would limit the max size to 10 MB after base64 encoding (in practice about 6.98 MiB raw size). As far as I know we fully allow embedded images in SVG if they are under that limit; however, they are usually not a good idea. If it was important to Commons that these types of files be uploaded, we might be able to add the flag, but I'd prefer to keep the flag off if it isn't really needed. Bawolff (talk) 21:36, 18 August 2024 (UTC)
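The relation between raw image size and base64-encoded size mentioned above can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the 10,000,000-character limit is taken from the comment above as an assumption, the function names are made up, and the calculation ignores line breaks inside the encoded data and the data: URI prefix, both of which lower the usable raw size somewhat further.

```python
import base64


def base64_length(raw_bytes):
    """Length of padded, unwrapped base64 output for raw_bytes input bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def max_raw_bytes(encoded_limit):
    """Largest raw size whose unwrapped base64 form fits in encoded_limit."""
    return (encoded_limit // 4) * 3


def matches_stdlib(raw_bytes):
    """Cross-check the formula against the standard library encoder."""
    return base64_length(raw_bytes) == len(base64.b64encode(b"x" * raw_bytes))


# Assumed limit on a single attribute or text node, per the comment above.
ASSUMED_LIMIT = 10_000_000
UPPER_BOUND = max_raw_bytes(ASSUMED_LIMIT)  # raw bytes, before any wrapping
```

Every 3 raw bytes become 4 encoded characters, so an embedded image grows by about a third when it is stored in the SVG.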
Questions about "files" in Special:UncategorizedPages
The past few days I have tried to clean up Special:UncategorizedPages. For the real gallery pages that is not a problem. But the vast majority of the pages in this list have the format "File:" (at the moment about 95%). That seems not OK to me. There are three types of files here:
- Embedded files (term made up by me): Files that do not contain any image or other medium, but only text, a table or a program; there is a link in another file (main file) to the embedded file. Sometimes the creator could move the text to the main file and ask for deletion of the embedded file. Question Can this be solved, can the content of the embedded files be moved to the main files or does another solution exist? Examples:
- Commons:Deletion requests/Files found with File:Wikipedia on GLAM-Tour Kulturkooperationen für lokale Wikipedia-Gruppen.webm; see also: Commons:Translators'_noticeboard#A_handful_of_File:_namespace_timed_text_translations_still_exist + Commons:Village_pump/Archive/2016/09#What_are_these_files? (so this is an eight year old problem) + Meta:Community Wishlist/Wishes/A tool for auto-transcription to speed up the creation of TimedTexts subtitles for videos on Commons (which can perhaps be a solution?)
- File:Alignment chart.png/author + File:Alignment chart.png/source; both used in File:Alignment chart.png
- File:GFDL (English).ogg/Warning 1
- File:LGBT rights world map.svg/description; used in File:World laws pertaining to homosexual relationships and expression (duplicate).svg
- The vast majority of the "Files" on this list contain a request for deletion or a note that a copyright violation is probable. They obviously do not belong here. Question How do these files end up here? How can they be removed from this list?
- The rest of the files look correct: they have an image or other medium, categories and are not nominated for deletion. But I cannot find them in the categories mentioned in the files. So I think something is wrong with these files, but I do not know what. Question Why are these files on this list? How can they be removed from it? Examples:
- File:"Principes_de_composition_de_Mr_Bernier,_ancien_Maitre_de_Musique_de_la_S.te_Chapelle,_à_Paris"_-_btv1b10868810x_(45_of_57).jpg
- File:2024-06-13_Archiv_des_Deutschen_Museums_30.jpg
- File:2024 Save the Core - Talco - by 2eight - 9SC1171.jpg
- File:Lettre_de_Jean_Baptiste_Victor_Mohr_à_Monsieur_Achille_Gouffé,_10_Février_1857_(manuscrit_autographe)_-_btv1b108806264_(3_of_3).jpg
- File:Parque_Estadual_da_Pedra_Branca_X_Diego_Monsores_(02).jpg
- File:Parque_Estadual_da_Pedra_Branca_X_Diego_Monsores_(03).jpg
- File:Séismes._Iles_Lipari_(dossier_239)_-_btv1b10875214q_(065_of_233).jpg
- File:Uni8F29_NotoSansSC-Light.svg
- File:Finnish_Transport_Safety_Agency.png
- Files starting with File:GCE Kannur
- File:High_School_-_geograph.org.uk_-_6526491.jpg
- File:Willoughby_St_Jul_2024_68.jpg
- File:Xenia - Hub District (BAP) - 8516245531.jpg
JopkeB (talk) 06:33, 5 August 2024 (UTC)
- In your list, #1 seem to be fake subpages. Subpages because they use the "/" format, fake because file namespace doesn't allow for subpages.
- Could be that pages with #3 need a null edit for the categories to be added completely. Enhancing999 (talk) 10:28, 5 August 2024 (UTC)
- Thanks @Enhancing999: for your remarks.
- @1 I understand. But they cannot be just deleted because the filenames have a wrong format. One way or another they should be replaced by something else. I hope someone has a solution.
- @2 I hope someone else can shed some light on that.
- @3 I carried out a null edit on some files and after that at least they were in the categories. I'll wait a few days and then check whether these files are no longer in the refreshed list. If so, I'll give the rest the same treatment. But it is strange that they need such treatment at all; can we not just prevent it?
- JopkeB (talk) 11:16, 5 August 2024 (UTC)
- For #3 a bot could do that on a regular basis. Do pages appear there if they are only in hidden category? #2 may be the same. Isn't there a separate process for uncategorized images? Maybe this special page isn't needed for that.
- About #1: if there is no associated file, the content should be moved to the appropriate place. This can be the actual file description page, template namespace or timedtext namespace. Once moved, this should be deleted. Enhancing999 (talk) 11:26, 5 August 2024 (UTC)
- Isn't there a separate process for uncategorized images? Yes, Special:UncategorizedImages exists. --Geohakkeri (talk) 11:31, 5 August 2024 (UTC)
- And there are Category:Uncategorized files, Category:Files needing categories and their subcategories, files are also put there in automated processes. And what is the difference with Special:UncategorizedImages? JopkeB (talk) 15:49, 5 August 2024 (UTC)
- The main difference is that the special page is populated by the wiki software itself. Or what you meant to ask? I don’t think I quite understood. --Geohakkeri (talk) 16:37, 5 August 2024 (UTC)
- Yes, thanks, this is the answer to my question. JopkeB (talk) 14:02, 7 August 2024 (UTC)
- A bot is fine. But shouldn't it be investigated why these files appear in the list? Then it may be possible to solve the problem and then we do not need a bot. There are many, many more files with hidden categories and files that are nominated for deletion which do not appear in the list. JopkeB (talk) 15:54, 5 August 2024 (UTC)
- I tried the null edit on the (non-deleted) pages. Let's see if it works. Also I fixed #1.2. above. Enhancing999 (talk) 22:06, 5 August 2024 (UTC)
- Also, I made a deletion request for #1.3 Enhancing999 (talk) 10:08, 6 August 2024 (UTC)
- Also, a rename request for #1.4 (move to template namespace). Enhancing999 (talk) 10:17, 6 August 2024 (UTC)
- @Enhancing999: I've tried to rename File:LGBT rights world map.svg/description, but I get an error: "Cannot move file to non-file namespace." I'm only a filemover, though; maybe an admin can do better. If not you may need to copy it (with proper attribution) and then get the current page deleted. --bjh21 (talk) 14:28, 6 August 2024 (UTC)
- I did some testing at testwiki: it's not possible to move directly, even for admins. A workaround seems to be Special:Import. Enhancing999 (talk) 23:14, 6 August 2024 (UTC)
- Didn't work, but it's moved now. Enhancing999 (talk) 23:38, 6 August 2024 (UTC)
- For a better overview, I propose an update to the introductions of the special pages, see MediaWiki talk:Uncategorizedpages-summary and MediaWiki talk:Uncategorizedimages-summary. Enhancing999 (talk) 10:22, 6 August 2024 (UTC)
- Until #1.1 is solved, I made Category:Timed Text in file namespace to get them categorized (someone needs to mark them for translation for the category to appear in all other subpages). Enhancing999 (talk) 10:30, 6 August 2024 (UTC)
- (@2) It's an old MediaWiki file cache error/bug. It occurs sometimes when using mass deletion tools like AjaxQuickDelete, Bad Old Ones, Twinkle, etc. One or two files' deletions remain incomplete; the deleting admin sees the red link, so unless they refresh the page or check the deletion log, they are unaware. The files have categories but do not appear in them, and eventually show up as uncategorized; they may also end up with corrupt thumbnails. Basically it's a state of partial deletion, lol. Purging doesn't help, but the issue can be easily fixed by just manually deleting the files. One can also add the deletion template again to make them appear in the deletion category for admin attention. For the current report, I have deleted all except those I was unsure about. -- CptViraj (talk) 20:08, 5 August 2024 (UTC)
- Thanks @CptViraj: for the explanation and deleting many files. I have still some questions:
- Since I have no rights to delete files manually, next time it is best that I add the deletion template again?
- What about the files with a Template:No permission since, like File:Logo-thanh-pho-Sa-Dec.webp? Can they get off the list too? Would the solution also be adding the template again?
- Solving the bug(s) is not an option?
- JopkeB (talk) 06:06, 6 August 2024 (UTC)
- No need to add a new deletion template. A null edit is enough to get the file categorized again. --Geohakkeri (talk) 08:13, 6 August 2024 (UTC)
- Null edits will only be useful when the file has a speedy deletion template (SD, copyvio, G10, F10, etc.) used, so it will be categorized in a speedy deletion category. But when the file has a DR template, it doesn't help as the DR request is already closed, there is no category for them, it will still go unnoticed. Same for any F5 (no license/source/permission) file, the respective maintenance category is already deleted, so it will too go unnoticed. Therefore in these two cases, it's better to just add {{SD}} mentioning the error to have admin attention. -- CptViraj (talk) 10:55, 6 August 2024 (UTC)
- Would you have a sample file? Enhancing999 (talk) 10:59, 6 August 2024 (UTC)
- Just a nitpick: old {{Npd}} cases are perfectly visible as they appear directly in Category:Media missing permission. --Geohakkeri (talk) 11:02, 6 August 2024 (UTC)
- @JopkeB:
- (1) Yes, but no need to add the same deletion template, just use a simple speedy template ({{SD}}) mentioning the error and original deletion reason, also see my above message regarding null edits.
- (2) The same applies here; just use the SD template as I said above. I think you are asking this because you noticed that all (or most) of the files left by me are these F5 ones, but it is not related, I have deleted many of these files, too.
- (3) Solving a bug is always an option. This isn't a Commons-only issue. Devs might already be aware of the bug; maybe there is already a Phabricator task, maybe there isn't. I don't know, as I never tried digging into this; there have always been some sort of cache issues with deleting in MediaWiki. Maybe I'm just too lazy to deal with it, as this isn't a major issue and is easily worked around 😅, and these files are deleted most of the time on every maintenance report by some admin; this time it just got piled up for a month or two. -- CptViraj (talk) 10:55, 6 August 2024 (UTC)
- Thanks, all of you, for the help. The list has been refreshed and there are now only 16 pages left. For the files about Wikipedia on GLAM-Tour Kulturkooperationen, a deletion process is under way (see Commons:Deletion requests/Files found with File:Wikipedia on GLAM-Tour Kulturkooperationen für lokale Wikipedia-Gruppen.webm). The rest looks like business as usual. JopkeB (talk) 13:35, 7 August 2024 (UTC)
- Impressive from 181 to 16! The translation subpages would also disappear if someone marked the main one for translation (I made request at Commons:Translators'_noticeboard#A_handful_of_File:_namespace_timed_text_translations_still_exist).
- User talk:Test919,733,084 leaves me puzzled. Enhancing999 (talk) 13:40, 7 August 2024 (UTC)
- Did an edit on that .. maybe it works. Also an edit request for Motd. Enhancing999 (talk) 14:43, 7 August 2024 (UTC)
- It got updated, but there are still 12. "User_talk:Test919,733,084" is still there. I tried moving it around: [1]. Enhancing999 (talk) 10:16, 10 August 2024 (UTC)
- Special:UncategorizedPages is now at 5 (3 new ones). Enhancing999 (talk) 21:10, 14 August 2024 (UTC)
File:Herrieu - Chansons populaires du pays de Vannes, 3e série, 1930.djvu not displaying
Hello,
The file Herrieu - Chansons populaires du pays de Vannes, 3e série, 1930.djvu uploaded 11 days ago is not displaying. I had almost the same issue with another DjVu file (see [2]). Could you see why? Thanks. Gwendal (talk) 07:24, 5 August 2024 (UTC)
- Update: after a "dummy" upload with the same file, it's displaying on Commons but not on the other wikis (see: s:br:Restr:Herrieu - Chansons populaires du pays de Vannes, 3e série, 1930.djvu).
- There is the same problem with the other file File:Herrieu - Chansons populaires du pays de Vannes, 1re série, 1911.pdf: see s:br:Restr:Herrieu - Chansons populaires du pays de Vannes, 1re série, 1911.pdf --Gwendal (talk) 05:36, 16 August 2024 (UTC)
Uploading a large SVG file
I am trying to upload this SVG file, and it is not working, maybe because it is too large. Are there size limitations on SVG files? What are they? Can somebody help me to upload it?
Thanks a lot, Aizenr (talk) 10:05, 7 August 2024 (UTC)
- My file is 38 MB. This is much smaller than the maximum file size, so this is probably not the main problem... Any other suggestions? Aizenr (talk) 17:17, 7 August 2024 (UTC)
- @Aizenr: Your SVG contains a large PNG, resulting in the same issue as in #The_XML_in_the_uploaded_file_could_not_be_parsed. --Dexxor (talk) 20:52, 7 August 2024 (UTC)
- Thank you very much. I have now read the discussion above, but I still do not understand what the solution is. Can I change something in my file that will solve the problem? If there is no other choice, I can downsize the PNG image. How much should I downsize it?
- Thanks again, Aizenr (talk) 04:47, 8 August 2024 (UTC)
- @Aizenr: I would simply convert the SVG into PNG using rsvg-convert -u or Inkscape, and then upload the PNG (it should also be much smaller, around 3MB). Dexxor (talk) 05:52, 8 August 2024 (UTC)
- Thank you for the suggestion. The main aim of this file was to create an SVG version of an existing PNG file so that text in different languages could be added. So I prefer to keep it as SVG. If I split the PNG into 2 pieces, do I have the same limitation for each of them or on the total? Thanks again, Rami (Aizenr, talk) 12:33, 8 August 2024 (UTC)
- @Aizenr: The PNG is already split into two pieces. I think your only option is to make the PNGs smaller, like this. ---Dexxor (talk) 20:58, 9 August 2024 (UTC)
- Isn't that problem not one of size, but that Category:Fake SVG are no longer accepted? Enhancing999 (talk) 21:03, 9 August 2024 (UTC)
- You would probably have to split it into more than two pieces. Try and aim so that no piece is larger than 6 megabytes. Bawolff (talk) 21:41, 18 August 2024 (UTC)
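To see which embedded pieces are too large before uploading, one could list the size of each embedded image in the SVG. The following Python sketch does that under stated assumptions: the function names are hypothetical, the 10,000,000-character default is the per-attribute limit suggested in the earlier thread (not a documented Commons setting), and the size counted is the length of the whole data: URI, that is, the size after base64 encoding.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def embedded_image_sizes(svg_source):
    """Return the length of each data: URI found on SVG <image> elements."""
    root = ET.fromstring(svg_source)
    sizes = []
    for image in root.iter(f"{{{SVG_NS}}}image"):
        # Inkscape writes xlink:href; newer SVG files may use plain href.
        href = image.get(f"{{{XLINK_NS}}}href") or image.get("href") or ""
        if href.startswith("data:"):
            sizes.append(len(href))
    return sizes


def oversized_images(svg_source, limit=10_000_000):
    """Return the sizes that exceed the assumed per-attribute limit."""
    return [size for size in embedded_image_sizes(svg_source) if size > limit]
```

An empty result from oversized_images would suggest that each piece is within the assumed limit; it says nothing about other reasons an upload might be rejected.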
Uploadwizard on mobile web pages, how to jump to top or bottom?
i'm batch uploading like 50 files. it's a pain to scroll slowly up or down. i just need to click the button to submit files... RZuo (talk) 13:33, 7 August 2024 (UTC)
- using mobile opera browser might be better? they seem to have a button to jump when a page is too long. RZuo (talk) 13:35, 7 August 2024 (UTC)
Flickr2Commons tool not working for about 24 hours.
I get this message currently:
- "Wikimedia Toolforge Error"
- "Our servers are currently experiencing a technical problem. This is probably temporary and should be fixed soon. Please try again later."
- "tools-proxy-8.tools.eqiad1.wikimedia.cloud"
It has been about 24 hours. Other Users are apparently using this same tool without any problems. Thanks, -- Ooligan (talk) 15:03, 8 August 2024 (UTC)
- Also see
- Commons:Village_pump#Flickr2Commons (9th August 2024)
- M2k~dewiki (talk) 23:42, 9 August 2024 (UTC)
- Still not working. Is this permanently broken, and I should choose a different way to do these tasks, or will it be coming back? - Jmabel ! talk 19:39, 12 August 2024 (UTC)
- https://phabricator.wikimedia.org/T372451 M2k~dewiki (talk) 15:56, 14 August 2024 (UTC)
- Update: User:DaxServer created a clone of F2C, which was available here, though the original F2C was still down. --A1Cafel (talk) 03:21, 18 August 2024 (UTC)
Extra buttons with the reply field
Above the reply field, I would like to add a few extra buttons to insert codes/templates into it (through my personal .js). Can anyone point me to an example of how it is done there, so that I can use that code and modify it? Thanks! Romaine (talk) 14:41, 10 August 2024 (UTC)
No edit summary when undoing caption edits
There is no automatic edit summary when undoing caption edits (see for example File:Dancer in Sari.jpg). Why is that, and can it be changed? Should it be added as a feature request on Phabricator, or how should one go about it? Jonteemil (talk) 18:05, 10 August 2024 (UTC)
- Yes, that is an annoying problem. I also mentioned this in #Vandals often move captions to other languages – detection needed below. Please create an issue if no issue about it exists and then please link it here and mark the section as solved. Prototyperspective (talk) 09:45, 19 August 2024 (UTC)
- Done. Jonteemil (talk) 18:35, 20 August 2024 (UTC)
Special:UncategorizedCategories says it runs once a month. It is now five days beyond that. Even once a month is a long time for this report, which used to run every three days. No doubt partly as a result of that longer interval, we have gone from having about 100-200 such categories early this year to over 1500 in the latest (July 7) report. I've dealt with several hundred of those, and would really like to have a more updated report to work from. As far as I can tell, there is no reasonable way to find uncategorized categories other than this report, and except for the rare false positive, every single uncategorized category genuinely represents a task to be done. - Jmabel ! talk 19:31, 12 August 2024 (UTC)
- Why was it changed to much less frequent runs? I think it would be better if like with Suggested Edits, categories were suggested for these (and also categories that already have categories) – for example using the categories of the Wikipedia items if the category is linked to (these could have cats with matching cats on WMC or cats that have parent cats with matching cats on WMC). Prototyperspective (talk) 23:20, 19 August 2024 (UTC)
- It's updated. Count of 11:11, 22 August 2024 is 1534. Enhancing999 (talk) 12:25, 22 August 2024 (UTC)
- phab:T369024: Seems the outcome is Special:UncategorizedPages got deactivated instead.
- @Ladsgroup can't we keep both?
- A way to speed it up could be to limit Special:UncategorizedPages to namespace 0. Pages don't need to appear there and on Special:UncategorizedFiles.
- FYI: @Bawolff, @Mdaniels5757, @JopkeB. Enhancing999 (talk) 14:00, 22 August 2024 (UTC)
- MediaWiki currently does not support setting Special:UncategorizedPages to use namespaces different than the Content Namespaces Bawolff (talk) 09:27, 23 August 2024 (UTC)
- The query for Uncategorized pages is really strange. I wonder what use case it's meant for. It's much quicker when it's limited to namespace 0. Adding namespace 6 and filtering that for pages without files is just strange. It seems to me that the bug needing fixing is to write correct SQL. Enhancing999 (talk) 09:42, 23 August 2024 (UTC)
- Thanks for pinging me. My questions:
- What exactly is the problem? What is the frequency of runs now? @Jmabel: What would be a good frequency for you?
- What is the purpose of phab:T369024 exactly? Please explain in plain English. I do not have enough knowledge of this tool to understand what the consequences of this task might be. Anyway: Special:UncategorizedPages should still run now and then; it should not be deactivated. If the current frequency is a problem, I think it might run weekly or perhaps even less frequently, but we cannot do without it.
- What does "limit Special:UncategorizedPages to namespace 0" mean? And in the next sentence: what does "there" mean? Now it looks like referring twice to the same list.
- JopkeB (talk) 09:34, 23 August 2024 (UTC)
- As you noted elsewhere Special:UncategorizedPages included both galleries (namespace 0) and file description pages that are empty (namespace 6). These empty file description pages appear also on Special:UncategorizedFiles.
- The query for namespace 6 in Special:UncategorizedPages is really resource intensive at Commons (this led to the bug above). I can't really think of a case where it's needed though.
- As the bug report isn't very clear, it led to the wrong "fixes" and Special:UncategorizedPages is currently deactivated. Fixing it correctly should make updating Special:UncategorizedPages much faster. Enhancing999 (talk) 09:51, 23 August 2024 (UTC)
- @Enhancing999: Thanks for your explanation. How would Special:UncategorizedPages get correctly updated, what should be done? Is phab:T369024 making that happen or is it the other way around: is that the ticket implementing the deactivating and should we ask for a new ticket/task? JopkeB (talk) 11:18, 23 August 2024 (UTC)
- BTW, currently there are few pages on Special:UncategorizedPages and Special:UncategorizedFiles, but this is due to several people cleaning them up fairly thoroughly since JopeBe's report above. This does not mean the reports aren't needed or should be run less frequently. Enhancing999 (talk) 09:59, 23 August 2024 (UTC)
- Hi, Page table of Wikimedia Commons currently has more than 110M rows, its categorylinks table has more than 800M rows. Joining these two tables with such conditions is too expensive in our production and can cause issues and bring down our databases. That's why we had to reduce its frequency to once a month. We will eventually migrate these reports to hadoop and bring back its previous frequency but that's far in the future (phab:T309738). In the meantime, you can improve the condition by excluding File namespace in the query (which would make it faster), and then just run it against wikireplicas. Similar to how English Wikipedia builds reports for their needs (en:WP:DBR). Sorry for the inconvenience but we don't really have a choice. ASarabadani (WMF) (talk) 11:38, 23 August 2024 (UTC)
@JopkeB: I guess I could live with Special:UncategorizedCategories being monthly, though certainly more often is better. Before the most recent report a couple of days ago, we'd gone 6 weeks. When you say "What exactly is the problem?" I'm not sure what you are asking. Are you asking how the page is used? Why it's a problem when it is far out of date? or something else? - Jmabel ! talk 18:55, 23 August 2024 (UTC)
- I have already got an answer to this question: there are technical problems with running the lists as frequently as we would like. JopkeB (talk) 05:14, 24 August 2024 (UTC)
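For anyone who wants to try the suggestion above of running a narrower query against the wiki replicas, here is a minimal sketch of what such a report could look like. It is not the query MediaWiki itself runs. It assumes the usual `page` and `categorylinks` columns (`page_id`, `page_namespace`, `page_title`, `page_is_redirect`, `cl_from`) and namespace 14 for categories, and it runs against an in-memory SQLite stand-in with made-up rows rather than the real replicas. The function names are hypothetical.

```python
import sqlite3

# Assumed report query: non-redirect category pages (namespace 14) that have
# no outgoing row in categorylinks, i.e. no parent category at all.
# Only one namespace is scanned, so the File namespace is never touched.
UNCATEGORIZED_CATEGORIES_SQL = """
SELECT p.page_title
FROM page AS p
LEFT JOIN categorylinks AS cl ON cl.cl_from = p.page_id
WHERE p.page_namespace = 14
  AND p.page_is_redirect = 0
  AND cl.cl_from IS NULL
ORDER BY p.page_title
"""


def find_uncategorized_categories(conn):
    """Return titles of category pages that are in no category."""
    return [row[0] for row in conn.execute(UNCATEGORIZED_CATEGORIES_SQL)]


def build_demo_db():
    """Create a tiny stand-in database with invented rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT,
            page_is_redirect INTEGER
        );
        CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?, ?)",
        [
            (1, 14, "Maps_of_Nepal", 0),           # categorized category
            (2, 14, "Orphaned_demo_category", 0),  # category with no parent
            (3, 6, "Empty_description.jpg", 0),    # file page, ignored
            (4, 14, "Old_redirect", 1),            # category redirect, ignored
        ],
    )
    conn.executemany(
        "INSERT INTO categorylinks VALUES (?, ?)",
        [(1, "Maps_by_country")],
    )
    return conn


if __name__ == "__main__":
    print(find_uncategorized_categories(build_demo_db()))
```

On the real replicas the same idea would need a proper database connection and probably batching by `page_id`; those parts are left out here because they depend on the hosting setup.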
Tech News: 2024-33
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Feature news
- AbuseFilter editors and maintainers can now make a CAPTCHA show if a filter matches an edit. This allows communities to quickly respond to spamming by automated bots. [3]
- Stewards can now specify if global blocks should prevent account creation. Before this change by the Trust and Safety Product Team, all global blocks would prevent account creation. This will allow stewards to reduce the unintended side-effects of global blocks on IP addresses.
Project updates
- Nominations are open on Wikitech for new members to refresh the Toolforge standards committee. The committee oversees the Toolforge Right to fork policy and Abandoned tool policy among other duties. Nominations will remain open until at least 2024-08-26.
- One new wiki has been created: a Wikipedia in West Coast Bajau (w:bdr:) [4]
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
File missing
Hey there! I uploaded files recently and ran into a problem twice. In rare cases, a page for a file is created, but the file itself is missing (e.g. File:POV – una Dottoressa dolcissima ti visita in ASMR.webm). How should this issue be handled? Thank you very much!
--PantheraLeo1359531 😺 (talk) 11:26, 13 August 2024 (UTC)
- This happens occasionally, the easiest is to upload the file again and then delete the page. Ymblanter (talk) 15:34, 13 August 2024 (UTC)
- Thanks. Yes, the file does exist, but I uploaded a new snippet and then reset to the complete file. --PantheraLeo1359531 😺 (talk) 06:38, 14 August 2024 (UTC)
File not showing up in Category:Other speedy deletions
I tagged File:His Girl Friday (1940, SDR).webm for speedy deletion and the page shows it as being in Category:Other speedy deletions, but a day later it still doesn't show up in the category. Is there a known reason that would happen? hinnk (talk) 23:16, 13 August 2024 (UTC)
- I can see the file in the respective cat --PantheraLeo1359531 😺 (talk) 06:44, 14 August 2024 (UTC)
- It wasn't showing up after ~24 hours. After a little longer it did appear for me in the category so I figured it was just a caching issue, but now it's gone again. hinnk (talk) 18:12, 14 August 2024 (UTC)
- And now it's back. This is super weird, I don't understand why it would repeatedly disappear and reappear like this. No changes have been made to the file or category pages, the only thing that's changed is other pages being added to or removed from the category. hinnk (talk) 19:01, 14 August 2024 (UTC)
I'm wondering if it might be related to phab:T129621/phab:T132921. The whole reason for opening the speedy deletion request was that reupload was giving an error because it couldn't acquire the lock. If any admins come across this, it'd be helpful to know what happens when you try and perform the deletion. hinnk (talk) 22:12, 14 August 2024 (UTC)
- It's known to happen for template based categories. Category:Non-empty_category_redirects had plenty. Enhancing999 (talk) 13:01, 15 August 2024 (UTC)
toollabs:para/Commons:Special:NewFiles: 504 Gateway Timeout
“Webservice request timed out. This URI is managed by the para tool, maintained by Para.” --Geohakkeri (talk) 11:30, 15 August 2024 (UTC)
Help with adding parameter to template
Would someone be so kind as to help me add a parameter to Template:PD-Sjöfartsverket? The parameter would function in the same way as the “cat” parameter from Template:Taken in, such that the code {{PD-Sjöfartsverket|cat=no}} would disable automatic categorization of transcluding pages. I can take care of updating the documentation afterwards. —VulpesVulpes42 (talk) 18:39, 15 August 2024 (UTC)
- Why would it have that? Template:PD-US doesn't have it. Enhancing999 (talk) 19:12, 15 August 2024 (UTC)
- @Enhancing999: This template has a much, much more limited use case. It automatically adds files to Category:Diagrams of water transport signs in Sweden. However there is also Category:Diagrams of water transport signs in Sweden obsoleted by SVG replacement, and files there should of course not be in the parent category as well. But due to the automatic categorization currently being undisableable, there is no way of removing files from the redundant parent category without also removing the template. —VulpesVulpes42 (talk) 19:21, 15 August 2024 (UTC)
Could translated pages be hidden from categories?
For example see Category:Commons video resources – all those translated pages in that category make it cluttered, hard to go through the pages, and bury pages on the next page.
Would it be possible to hide translated pages so that only one is shown?
- One could have the links to the translated pages at the top of that page
- One could automatically open the respective translated page when opening the page depending on one's language settings
- (and there are more alternatives)
Prototyperspective (talk) 22:00, 16 August 2024 (UTC)
- Much less useful but still useful would be if the translated pages were also hidden or hidable in the File uses on commons section of file pages because it makes it cluttered and hard to see where a file is used; example. Prototyperspective (talk) 10:08, 19 August 2024 (UTC)
- It's possible to place them in a subcategory, but I don't think status quo is necessarily an issue. They should be categorized in any case. Enhancing999 (talk) 12:17, 22 August 2024 (UTC)
- It's currently not a big problem but it makes things far less overseeable and buries things on other pages or beneath the cluttered page. Imagine if there were 300 translated versions of a page which is just roughly the number of languages with a Wikipedia and not even all notable languages, it would make the category barely usable to find and organize things. Manually subcategorizing wouldn't be a good solution because it requires people to spend time manually doing so and new translations will be directly in the category again. Maybe there could be some multilingual redirect page that redirects to whatever language the user has configured if that language version exists and English otherwise? Prototyperspective (talk) 12:35, 22 August 2024 (UTC)
Vandals often move captions to other languages – detection needed
What's going on with people apparently systematically moving captions from one language to a wrong one? This happens frequently and is often, if not usually, not detected and reverted by editors.
Could a detection of this please be developed? Compared to other vandalism that is well-detected automatically on Wikipedia by for example ClueBot NG (acc) I think it would be easy to detect if a caption got moved from the original language to another one, if possible with language detection so it also checks if the language it was moved indeed does not match the caption text language. Maybe a better place to ask about this would be the Bot requests or the ClueBot NG talk page / code repo.
- Examples
Previously I was wondering how to search my contributions (edit summaries). I was trying to use standardized phrases or terms in edit summaries so I can easily look them up later, for example to replace texts I previously added with templates. I found the following useful tool and because I try to always use term "rvv" when reverting edits that are or seem to most likely be vandalism many examples of such edits can be found here. These are the most recent ones: 1 2 3 4 5 6 7. By the way, I think captions are more a problem or redundant than anything else since there already is the machine-translatable description field (it's useful sometimes when descriptions are long but these could also instead have a short version at the top or be shortened). Another problem is that when undoing changes to captions there is no prefilled edit summary so one has to tediously copy the contributions link of the user and write the edit summary anew. Prototyperspective (talk) 13:04, 17 August 2024 (UTC)
- Given how often this happens (I was able to find a couple of recent instances in a few minutes of looking through RecentChanges) and how weirdly specific of an action it is (it only changes the language of the caption, never its content), I suspect this is a UI/UX issue, not deliberate vandalism, and I suspect it can be addressed by making some changes, like inserting a confirmation dialog when changing the language of an existing caption.
- Does anyone know how/where the caption editing interface is implemented, and who's responsible for it? (I also have a couple of gripes about the language picker used in the caption interface - especially its appearance on mobile.) Omphalographer (talk) 05:16, 18 August 2024 (UTC)
- you're right. try File:JPG Test.jpg.
- given existing captions, users can change the language and then click publish. that results in the aforementioned problematic edits. RZuo (talk) 05:30, 18 August 2024 (UTC)
- There is the new feature to require captchas for actions defined in an abuse filter. I thought about requiring captchas for all IP edits on captions. This might reduce these kinds of accidental edits. GPSLeo (talk) 05:36, 18 August 2024 (UTC)
- Could be but sometimes they do change the text or change multiple languages at once and it seems like it's always only done by new or unregistered users who sometimes did some other different problematic changes.
- Another thing that could be done is automatic detection of the language and displaying at least a warning or adding the file to a maintenance cat if it doesn't match the specified language – this would also be useful since often people specify the wrong language even at upload.
- @RZuo: What do you mean? Why would Omphalographer be right in that this is a UI/UX issue? What you described is exactly the expected behavior: changing the language and then clicking publish, how does this suggest it's not vandalism and a UI issue? I don't see why you and GPSLeo think it would be accidental / an UI issue if one has to deliberately click "publish".
- Even if both captchas are added and they indeed reduce these changes, I still think there should be automatic detection of these changes as well as other likely vandalism. Why is ClueBot_NG not active on WMC? Does ORES work with WMC? There's lots of vandalism here (not just in the captions and the relatively hidden structured data) and I've come across multiple cases where it stayed on a relatively large page for a year or so. Bots/tools could build a queue of edits to check as well as automatically revert edits that are very likely to be vandalism. Moreover, they could learn from edits that specify that the reverted edit was likely vandalism or similarly nonconstructive (it doesn't matter if deliberate or not) if terms like RVV are used.
- Prototyperspective (talk) 10:07, 18 August 2024 (UTC)
- People are on a website and they see a language selection. They want to change the language of the text. They do not expect that they are able to change the content on a website where they do not even have an account. The button says publish but how is this translated into different languages and are these translated terms always that clear? GPSLeo (talk) 10:37, 18 August 2024 (UTC)
- This type of "vandalism" is so specific that it would basically have to be a very dedicated LTA who utilizes IPs from all around the world; it's disruptive but I don't think it's deliberate. Gnomingstuff (talk) 04:49, 26 August 2024 (UTC)
- Yes I think the explanation by GPSLeo is quite plausible. The main subject of this thread however is detecting such edits and auto-reverting them (also useful for other unconstructive edits / reducing maintenance workload) regardless of whether or not they are intentional or not. I don't know what you mean by "LTA" but it could have also been many people that found this to be an effective type of vandalism as it's often not detected and reverted but I already think inadvertent edits may be more likely. Maybe there could be some special confirmation box asking if the user really wants to publish that to the file data without using the word "publish" because maybe those users didn't understand that word. In any case, detecting if the text in the caption matches the languages seems useful and needed in any case, for example because many users add captions in English to other languages at upload or similar things. Prototyperspective (talk) 09:52, 26 August 2024 (UTC)
- Something in the GUI needs improvement. It's plausible that changing the language and saving it actually adds an additional language rather than deleting one as well: [5] Enhancing999 (talk) 10:10, 26 August 2024 (UTC)
- LTA = long-term abuse, people who vandalize in their specific identifiable way for months or years.
- The main tells on these are edits by infrequent editors with 2 changes per file. Not much help though once it's out of recent changes. I've found some by searching for captions with mismatched languages e.g. "Spanish the," but obviously that only works with certain patterns.
- The problem though with any kind of auto-reverting is that it would have to not catch people fixing this stuff, especially when it's undetected. Gnomingstuff (talk) 11:29, 27 August 2024 (UTC)
- Many don't have 2 changes per file. I think the main indicators are 1. language does not match specified caption language (no other indicator is needed; check language auto-detection of Google Translate or DeepL to see what I mean with language detection) 2. user isn't an editor with many unreverted edits (would only use this indicator early on as language mismatching is a general problem).
- I don't understand what you mean with The problem though with any kind of auto-reverting is that it would have to not catch people fixing this stuff. People fixing this stuff would move the caption back to its matching language or remove the flawed captions so they wouldn't be detected. Prototyperspective (talk) 11:51, 27 August 2024 (UTC)
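As a rough illustration of the detection idea discussed in this thread, the sketch below compares the captions of a file before and after an edit and flags a caption whose text is unchanged but now sits under a different language code. It is only a heuristic under stated assumptions: the captions are assumed to be available as plain `{language_code: text}` dictionaries (not the actual structured data format), the function name is hypothetical, and no real language detection is done, so captions that were wrong from the start are not caught.

```python
def find_suspicious_caption_changes(before, after):
    """Flag captions that were moved or copied to another language unchanged.

    `before` and `after` are assumed to map language codes to caption text.
    Returns a list of (kind, old_lang, new_lang, text) tuples, where kind is
    "moved" if the caption disappeared from its old language and "copied"
    if the identical text now exists under both language codes.
    """
    findings = []
    for new_lang, new_text in after.items():
        if new_lang in before:
            continue  # this language already had a caption; not a new entry
        for old_lang, old_text in before.items():
            if old_text == new_text:
                still_there = after.get(old_lang) == old_text
                kind = "copied" if still_there else "moved"
                findings.append((kind, old_lang, new_lang, new_text))
    return findings


if __name__ == "__main__":
    # Invented example: an English caption ends up under the Spanish code.
    old = {"en": "A dancer in a sari"}
    new = {"es": "A dancer in a sari"}
    print(find_suspicious_caption_changes(old, new))
```

Someone repairing such an edit moves the text back to a language that already held it in an earlier revision, so a bot built on this idea would still need to look at more than one revision before reverting anything.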
Geolocate Nepal photos
Date | Name | Thumbnail | Size | Description |
---|---|---|---|---|
09:31, 23 March 2019 | Chandragiri Cable Car, 2018-04-21 (2).jpg (file) | | 5.93 MB | User created page with UploadWizard |
07:18, 23 March 2019 | Chandragiri Cable Car, 2018-04-21 (1).jpg (file) | | 6.31 MB | User created page with UploadWizard |
plz take a look at User:Simasuru's photos. i located one of them to 27.685, 85.2138. do you think that's right? once it's confirmed, i will rename the files. RZuo (talk) 05:46, 18 August 2024 (UTC)
- I'd tend to have these tests deleted. Enhancing999 (talk) 12:21, 18 August 2024 (UTC)
- Rename yes, but meets COM:SCOPE, right? --PantheraLeo1359531 😺 (talk) 11:55, 19 August 2024 (UTC)
- @RZuo: it seems to match Google Street View. Of course locating an image retroactively might result in some minor error margins, but it should not be a huge deal. I've moved them to a more appropriate file name now. —Matrix(!) {user - talk? - uselesscontributions} 15:13, 27 August 2024 (UTC)
Can not find out what causes an error
I apologize, this is my first attempt to upload three images: https://commons.wikimedia.org/w/index.php?title=Special:UploadWizard&campaign=CHM-DE-HE&categories=Cultural+heritage+monuments+in+Erbach+%28Odenwald%29&description=Stra%C3%9Fenbr%C3%BCcke%2C+Eisenbahn+%28auf+der+Grenze+zu+Michelstadt%29&descriptionlang=de&fields%5B0%5D=952486&fields%5B1%5D=49.67158%2F8.99068 - I get "Beim Ausfüllen der Formulare sind 2 Fehler aufgetreten. Bitte die Fehler berichtigen und erneut speichern." (in English: "2 errors occurred while filling out the forms. Please correct the errors and save again.") - But I have no idea which errors. The only thing red on the page is the browser spellcheck in text fields - How can I find out? Shyof (talk) 15:14, 19 August 2024 (UTC)
- Maybe it's the same problem as this one?: Commons:Village pump#No error message for same file names in Upload Wizard. It should probably be reported at this page instead of here. Prototyperspective (talk) 12:02, 20 August 2024 (UTC)
- Hmm, that could have been possible - uploading the files one-by-one without changing anything else worked (the names differed but had a common prefix)... Thanks for the hint! Shyof (talk) 15:54, 20 August 2024 (UTC)
Discrepancy between search and PetScan result based on same search
Hi all, I've noticed that for some days (can't tell exactly :-( ), there's been a discrepancy between search results on Commons and PetScan search results based on the very same search. For example, a search on Commons with search string Tschubby map incategory:"Media_missing_infobox_template" currently yields 1,087 hits. Executing exactly the same search on PetScan yields between (!) 1,039 and 1,065 results (I did multiple search runs with same psid). Repeating the search on PetScan yields different results, while the Commons search result is stable. This looks like a PetScan issue, but before reporting there, I wonder if others found a similar behaviour, or if I'm doing something wrong. Fl.schmitt (talk) 16:41, 19 August 2024 (UTC)
- I noticed some discrepancies, but they seemed minor. Petscan uses different channels that might fail or occasionally have their own problems. Maybe you want to use the category instead. Enhancing999 (talk) 16:39, 21 August 2024 (UTC)
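To narrow down where such a discrepancy comes from, it can help to compare the actual title lists rather than only the counts. The sketch below is a minimal illustration that assumes the results of each search have already been exported as lists of page titles (how they are exported is left open); the function names and sample titles are made up.

```python
def compare_result_sets(search_titles, petscan_titles):
    """Show which titles appear in only one of the two result lists."""
    search, petscan = set(search_titles), set(petscan_titles)
    return {
        "only_in_search": sorted(search - petscan),
        "only_in_petscan": sorted(petscan - search),
    }


def unstable_titles(runs):
    """Return titles that appear in some, but not all, repeated runs."""
    sets = [set(run) for run in runs]
    if not sets:
        return []
    return sorted(set.union(*sets) - set.intersection(*sets))


if __name__ == "__main__":
    # Invented sample data standing in for exported result lists.
    commons = ["File:A.svg", "File:B.svg", "File:C.svg"]
    petscan_runs = [["File:A.svg", "File:C.svg"], ["File:A.svg", "File:B.svg"]]
    print(compare_result_sets(commons, petscan_runs[0]))
    print(unstable_titles(petscan_runs))
```

A list of titles that come and go between runs with the same psid would be a concrete thing to attach to a PetScan bug report.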
Tech News: 2024-34
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Feature news
- Editors who want to re-use references but with different details such as page numbers, will be able to do so by the end of 2024, using a new sub-referencing feature. You can read more about the project and how to test the prototype.
- Editors using tracking categories to identify which pages use specific extensions may notice that six of the categories have been renamed to make them more easily understood and consistent. These categories are automatically added to pages that use specialized MediaWiki extensions. The affected names are for: DynamicPageList, Kartographer, Phonos, RSS, Score, WikiHiero. Wikis that have created the category locally should rename their local creation to match. Thanks to Pppery for these improvements. [6]
- Technical volunteers who edit modules and want to get a list of the categories used on a page can now do so using the categories property of mw.title objects. This enables wikis to configure workflows such as category-specific edit notices. Thanks to SD001 for these improvements. [7][8]
Bugs status
- Your help is needed to check if any pages need to be moved or deleted. A maintenance script was run to clean up unreachable pages (due to Unicode issues or introduction of new namespaces/namespace aliases). The script tried to find appropriate names for the pages (e.g. by following the Unicode changes or by moving pages whose titles on Wikipedia start with Talk:WP: so that their titles start with Wikipedia talk:), but it may have failed for some pages, and moved them to Special:PrefixIndex/T195546/ instead. Your community should check if any pages are listed there, and move them to the correct titles, or delete them if they are no longer needed. A full log (including pages for which appropriate names could be found) is available in phab:P67388.
- Editors who volunteer as mentors to newcomers on their wiki are once again able to access lists of potential mentees who they can connect with to offer help and guidance. This functionality was restored thanks to a bug fix. Thank you to Mbch331 for filing the bug report. You can read about that, and 18 other community-submitted tasks that were resolved last week.
Project updates
- The application deadline for the Product & Technology Advisory Council (PTAC) has been extended to September 16. Members will help by providing advice to Foundation Product and Technology leadership on short and long term plans, on complex strategic problems, and help to get feedback from more contributors and technical communities. Selected members should expect to spend roughly 5 hours per month for the Council, during the one year pilot. Please consider applying, and spread the word to volunteers you think would make a positive contribution to the committee.
Learn more
- The 2024 Coolest Tool Awards were awarded at Wikimania, in seven categories. For example, one award went to the ISA Tool, used for adding structured data to files on Commons, which was recently improved during the Wiki Mentor Africa Hackathon. You can see video demonstrations of each tool at the awards page. Congratulations to this year's recipients, and thank you to all tool creators and maintainers.
- The latest Wikimedia Foundation Bulletin is available, and includes some highlights from Wikimania, an upcoming Language community meeting, and other news from the movement.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
Options to "Use this file" do not appear for some files
For some files, the top bar only contains links to the file history and usage. Some examples:
- File:The Complete Doom Accessory Pack Volume IV CD-ROM.jpg
- File:Colliers Times New Roman letter perfect.jpg
- File:Biomet-logo.gif
It does not contain the option to download or use the file. What's the reason for this inconsistency? Ixfd64 (talk) 20:30, 20 August 2024 (UTC)
How to show different contents on mobile?
I don't know why nobody is replying at Talk:Main Page#Edit requests, but the Main Page has for years referred to the links "to the right", while on mobile, which could well be how the majority of people now land on this page, the links are not on the right but further down. Is there a way for this part to be different on mobile? Extension:MobileDetect doesn't seem to work on WMC. Prototyperspective (talk) 20:19, 21 August 2024 (UTC)
- You use media queries and the skin classes and fix the templatestyles. —TheDJ (talk • contribs) 18:10, 25 August 2024 (UTC)
How to use temporal media fragments to link to audio chapters in file description?
With videos it is possible to specify a start (and end) time of the video – see Commons:Video#Temporal media fragments.
- Is it possible to link to times of the videos in the file description?
- Is that already possible somehow for audio files?
I think it would be very useful if in the file description of Spoken Wikipedia audio files, like those that I just uploaded, there were links to the different sections of the article. If one is only interested in a particular section of the article one could jump to it directly and listen to only that. It also gives some orientation where one is currently at when listening to articles. Linking to chapters is possible on YouTube and probably many users have already noticed some ways this can be useful. For example, I'd like to link the timestamps in the description here and add timestamp links to the different exercises here.
This doesn't seem to work with videos either as one would have to append ?start=00:26 but can only append things like #start=00:26. Is there some issue about these things? Prototyperspective (talk) 12:46, 22 August 2024 (UTC)
- There isn't really a universal chapter methodology in HTML5 video. The HTML5 idea is that you create your own VTT file (which we don't yet support). These could refer to Commons link (they can be anything you like, as all support for it is completely custom work). Then you write custom Javascript to listen to the timedtext events, read the 'text' (a description and link in this case) and go do something with that (wrapped inside a videojs custom plugin).
- Related tickets about this in phabricator. phab:T116154, phab:T301826. —TheDJ (talk • contribs) 18:08, 25 August 2024 (UTC)
- You can link to start points in the file description page using start= and end= query params like:
- File:Using_a_Power_Rack_in_bodybuilding,_powerlifting,_strength_training,_resistance_exercise.webm?start=00:10&end=00:15, but it is not that useful, as there is no autoplay etc. —TheDJ (talk • contribs) 18:14, 25 August 2024 (UTC)
- The two issues you linked don't seem to be related to this at all. What you wrote is about annotations like a link or a balloon message informing about an inaccuracy in the video at specific times of the video. In contrast this post here is about video chapters and linking to different times of the video in the file description.
- Yes, I know that one can link to start points that way and explicitly said that in the last paragraph. As said, one would need to link to these from there like #start=00:26 because when including a link like your example, it opens the video in a new tab at that starting point instead of directly jumping to it.
- This is really important for Spoken Wikipedia where you may like to jump to a particular section. There also needs to be some proper audio player, with the current one only being the fallback, that for example is wider so you can jump to some timing better and with a -10 seconds feature.
- Prototyperspective (talk) 10:49, 26 August 2024 (UTC)
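For reference, links of the `?start=…&end=…` form mentioned above can be generated for a whole list of chapters with a few lines of code. The sketch below only builds URLs in the pattern quoted in this thread; the base URL, the function names, and the chapter labels are assumptions for illustration, and as noted above such links open the file page at the given time rather than jumping within an already open player.

```python
from urllib.parse import quote, urlencode

# Assumed base URL for file pages on Commons.
BASE_URL = "https://commons.wikimedia.org/wiki/"


def timed_file_link(file_title, start, end=None):
    """Build a file page URL with start (and optionally end) query params."""
    params = {"start": start}
    if end is not None:
        params["end"] = end
    # Keep ":" and "," readable in the title and in the timestamps.
    path = quote(file_title.replace(" ", "_"), safe=":,")
    return BASE_URL + path + "?" + urlencode(params, safe=":")


def chapter_links(file_title, chapters):
    """Turn (label, start) pairs into (label, url) pairs."""
    return [(label, timed_file_link(file_title, start)) for label, start in chapters]


if __name__ == "__main__":
    title = (
        "File:Using_a_Power_Rack_in_bodybuilding,_powerlifting,"
        "_strength_training,_resistance_exercise.webm"
    )
    print(timed_file_link(title, "00:10", "00:15"))
    # Hypothetical chapter labels, not taken from the actual video.
    for label, url in chapter_links(title, [("Intro", "00:00"), ("Exercise 1", "00:26")]):
        print(label, url)
```

Whether the same parameters have any effect on audio files is not settled in this thread, so the output should be treated as untested for Spoken Wikipedia recordings.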
Bot no longer working to warn Wikivoyage about nominations for deletion?
As I stated in User talk:IronGargoyle#No notifications to Wikivoyage anymore? (edited for brevity and relevance):
Commons delinker delinked an image on Wikivoyage that was deleted in the Commons:Deletion requests/Files in Category:Patuxai thread. Why was there never a notification on the voy:Talk:Indochina Wars page that this file was nominated for deletion? We're supposed to get such notifications.
Was there a malfunction of the relevant bot, or was a unilateral decision made on Commons to stop giving notice to other sites such as Wikivoyage that use thumbnails of images on Commons in our articles but can choose to locally upload images that we consider important and are problematic merely due to a lack of commercial freedom of panorama? If the bot malfunctioned, please try to find out why and ensure the problem does not recur. However, if a unilateral decision was made to stop giving sister sites such as Wikivoyage the chance to make our own decisions on affected files before they are deleted on this site, I cannot state too strongly that that is absolutely unacceptable! We cannot return to the days when slews of images were deleted from Wikivoyage articles without notice. I'll look forward to your response and explanation of how you will prevent this problem from recurring. Thanks, everyone! -- Ikan Kekek (talk) 04:30, 25 August 2024 (UTC)
- I do not think this is specific to Wikivoyage; there were also bot notifications about imminent deletions on the English Wikipedia, but I do not see them anymore. Maybe the bot got abandoned, lost the flag, or something else happened. Ymblanter (talk) 18:23, 27 August 2024 (UTC)
- I'm not entirely sure whether Wikivoyage was covered by the bot; however, I do know that the bot was recently broken for quite a while (see https://phabricator.wikimedia.org/T339145). A surprisingly small number of communities adopted/approved that bot when the Community Tech team finally rewrote it in 2018, by the way. Really sad, if you realize how much money the Foundation poured into reworking that bot. —TheDJ (talk • contribs) 20:17, 27 August 2024 (UTC)
- We need to be notified when images we use as thumbnails are nominated for deletion. If the bot isn't working, what's the solution? -- Ikan Kekek (talk) 23:03, 27 August 2024 (UTC)
Croptool connections/authorization
Hey guys, I use CropTool to correctly size images from Wikimedia Commons for use as page banners on Wikivoyage. Unfortunately, connections have been consistently failing for the last couple of days. Has anyone seen this? Any tips on what I might need to do to get it to work? Mrkstvns (talk) 21:13, 25 August 2024 (UTC)
- See COM:CropTool, the tool should be up now, but it seems like Toolforge had issues over the weekend. Sohom (talk) 14:53, 26 August 2024 (UTC)
Dark mode fix on Main Page?
I've pushed some dark mode fixes to Template:Main Page Template, but they are not reflected on the Main Page for some reason. However, they are reflected in other languages like Tamil (முதற் பக்கம்). Is there a reason why? —Matrix(!) {user - talk? - uselesscontributions} 05:21, 26 August 2024 (UTC)
Upload Wizard bug
I uncovered a bug in the Upload Wizard. Can it be reported somewhere locally, or should I use Phabricator? The Upload Wizard main page and Upload Wizard FAQ are both silent on bug reporting. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:49, 26 August 2024 (UTC)
- The correct place on-wiki would be Commons talk:WMF support for Commons/Upload Wizard Improvements. But in this case this is not an UploadWizard problem. We have a template with the same name as a language code. That is an error that should never happen, and we need to delete the template redirect. GPSLeo (talk) 15:07, 26 August 2024 (UTC)
- I moved all uses of the template redirect to the actual template and deleted the redirect. But I am unsure if the language code handling will work without any additional action. GPSLeo (talk) 15:29, 26 August 2024 (UTC)
- Thank you. Did you check whether any of those were genuine attempts at using the language? Looks like we need to replicate {{En}}, {{De}} etc., for the abr language code. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:34, 26 August 2024 (UTC)
- Yes, I checked the files before changing the template. 6 files wanted to use the language code. I created {{Abr}} and it seems to work correctly. GPSLeo (talk) 15:44, 26 August 2024 (UTC)
Subcategory not seen in a category
The Category:Books from the United States is categorized in Category:Books by country, but it is nowhere to be seen in that category. What is the reason? -- Jan Kameníček (talk) 16:53, 26 August 2024 (UTC)
Tech News: 2024-35
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Feature news
- Administrators can now test the temporary accounts feature on test2wiki. This was done to allow cross-wiki testing of temporary accounts, for when temporary accounts switch between projects. The feature was enabled on testwiki a few weeks ago. No further temporary account deployments are scheduled yet. Temporary Accounts is a project to create a new type of user account that replaces IP addresses of unregistered editors which are no longer made public. Please share your opinions and questions on the project talk page.
- Later this week, editors at wikis that use FlaggedRevs (also known as "Pending Changes") may notice that the indicators at the top of articles have changed. This change makes the system more consistent with the rest of the MediaWiki interface. [9]
Bugs status
- Editors who use the 2010 wikitext editor, and use the Character Insert buttons, will no longer experience problems with the buttons adding content into the edit-summary instead of the edit-window. You can read more about that, and 26 other community-submitted tasks that were resolved last week.
Project updates
- Please review and vote on Focus Areas, which are groups of wishes that share a problem. Focus Areas were created for the newly reopened Community Wishlist, which is now open year-round for submissions. The first batch of focus areas are specific to moderator workflows, around welcoming newcomers, minimizing repetitive tasks, and prioritizing tasks. Once volunteers have reviewed and voted on focus areas, the Foundation will then review and select focus areas for prioritization.
- Do you have a project and are willing to provide a three (3) month mentorship for an intern? Outreachy is a twice a year program for people to participate in a paid internship that will start in December 2024 and end in early March 2025, and they need mentors and projects to work on. Projects can be focused on coding or non-coding (design, documentation, translation, research). See the Outreachy page for more details, and a list of past projects since 2013.
Learn more
- If you're curious about the product and technology improvements made by the Wikimedia Foundation last year, read this recent highlights summary on Diff.
- To learn more about the technology behind the Wikimedia projects, you can now watch sessions from the technology track at Wikimania 2024 on Commons. This week, check out:
- Community Configuration - Shaping On-Wiki Functionality Together (55 mins) - about the Community Configuration project.
- Future of MediaWiki. A sustainable platform to support a collaborative user base and billions of page views (30 mins) - an overview for both technical and non technical audiences, covering some of the challenges and open questions, related to the platform evolution, stewardship and developer experiences research.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
Addition of Pattypan Userguide on Commons
A colleague and I are curating a complete guide to using Pattypan for batch uploads to Wikimedia Commons, covering image quality, upload processes, licensing, the creation of hyperlinks, and copyright approval. I need help understanding how we can upload it here on Wikimedia for use by everyone. Lutarchitecture (talk) 00:44, 27 August 2024 (UTC)
- You could complete Commons:Pattypan?
∞∞ Enhancing999 (talk) 15:19, 27 August 2024 (UTC)
- What do you mean by completing Pattypan? Please shed more light on this and give more context so that I can understand you properly. Thanks. Lutarchitecture (talk) 21:14, 27 August 2024 (UTC)
- That page and its subpage are the Commons guide to using Pattypan.
∞∞ Enhancing999 (talk) 21:17, 27 August 2024 (UTC)
Search operators for audio file size and duration?
Because the gallery of new files (linked in the left panel that is shown on all pages under "Latest files") is broken for audio files, since it's very cluttered with lots of pronunciation files and has no way to filter files out within the page, I created this search that to some extent filters out pronunciation files: -intitle:/LL\-Q/ -Pronunciation -deepcategory:"Tamil pronunciation" -deepcategory:"Dutch pronunciation" -deepcategory:"English pronunciation" filetype:audio
There are two problems with it: the deepcategory search operator does not work properly, so the parent category can't be used (phab:T369808), and it's quite slow because the "intitle" search operator can't deal with "LL-Q" due to the hyphen, which requires this regex that slows the search down (phab:T371195). It's still good enough to be used in practice to see recent audio files that aren't pronunciation files, e.g. audiobooks, Spoken Wikipedia, soundscapes, and music tracks, as well as lots and lots of copyvios that no human or bot seems to check. To improve it further, I'd like to add search operators to exclude files with a duration of only a few seconds or a file size of just a few kB, because these are usually also just pronunciation files. It seems that PetScan can do this, so is there a way to add this to the WMC search? If such search operators don't exist, is there a Phabricator issue about adding them? They would also be useful, for example, for subcategorizing short-film videos in film categories, and for many other things. Prototyperspective (talk) 10:16, 28 August 2024 (UTC)
- @Prototyperspective: You can search by file size in kilobytes using filesize: (see mw:Help:CirrusSearch#File properties search). But there doesn't seem to be a property for duration. You can see what the search system can use by appending ?action=cirrusdump to a page URL, and trying that on a random audio file I can't see anything that looks like the duration. It seems to me that duration would be an obvious measure alongside height and width, but I'm not sure where I'd suggest that. --bjh21 (talk) 11:22, 28 August 2024 (UTC)
- Oops, thanks. Apparently I only Ctrl+F searched for "file size" with a space, not "filesize", on that help page. I added filesize:>100 to the search string.
- I'll make a Phabricator issue asking about a search operator for duration then, because even if that's not needed for filtering out pronunciation audio files it would still be useful for finding short films and many other use cases I haven't thought of. Prototyperspective (talk) 12:54, 28 August 2024 (UTC)
- This ?action=cirrusdump thing seems very useful, so I think it should be included in the help page properly; could you add it there? (Currently it's only buried in the notes of some reference.) Another issue is that I can't find out how to specify the sort order with the search operators (asked about it at Template talk:Search link): how could one specify sort=create_timestamp_desc as the sort order? Prototyperspective (talk) 13:00, 28 August 2024 (UTC)
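For anyone who wants to run such a search from a script, here is a minimal sketch of how the query string from this thread could be combined with a sort order. It assumes the MediaWiki search API (action=query, list=search) and its srsort parameter with the value create_timestamp_desc; please verify both against the API documentation before relying on them. The function name build_search_url is made up for illustration, and the code only builds the URL without sending any request.

```python
from urllib.parse import urlencode

# Commons API endpoint (assumed; adjust for other wikis).
API_URL = "https://commons.wikimedia.org/w/api.php"


def build_search_url(query: str, sort: str = "create_timestamp_desc", limit: int = 50) -> str:
    """Build a search API URL for the File namespace (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,   # the same string one would type into the search box
        "srnamespace": 6,    # 6 = File namespace
        "srsort": sort,      # assumption: srsort accepts create_timestamp_desc
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


# Shortened version of the query discussed above, with the file size filter added.
QUERY = r"filetype:audio filesize:>100 -intitle:/LL\-Q/ -Pronunciation"
url = build_search_url(QUERY)
print(url)
```

The printed URL can be opened in a browser to inspect the JSON result; the operators themselves (filetype:, filesize:, -intitle:) are passed through unchanged, so anything that works in the search box should behave the same way here.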
How to distinguish categories set by the Infobox from other cats?
It would be great if somebody could create a report that lists categories that don't have any categories except for meta-categories set by the Wikidata Infobox like Category:Uses of Wikidata Infobox with no image as proposed here. Example
Is there some way to distinguish categories set by the Infobox from other categories?
It would be best if one could also distinguish between meta-categories set by the WD Infobox like the one above and other categories set by the WD Infobox like Category:Lonnie (given name) (from example) because some categories that only have categories set by the WD Infobox don't need any other categories so would best show up in some separate less important report at some point (and it may even be desirable to increase cats that only have cats set by their WD Infobox). Prototyperspective (talk) 19:47, 30 August 2024 (UTC)
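One possible approach, sketched below under stated assumptions: compare the full list of categories a page is in with the category links that are literally written in its wikitext. Anything in the first list but not in the second was added by some template. The sketch assumes the full list has already been fetched elsewhere (for example from the API), and it cannot tell which template added a category, so it would not by itself separate Wikidata Infobox categories from those added by other templates. The names split_categories and normalize are made up for illustration.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sortkey]] written directly in wikitext.
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
)


def normalize(title: str) -> str:
    """Treat underscores as spaces and uppercase the first letter, as MediaWiki does."""
    title = title.replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def split_categories(all_categories, wikitext):
    """Split a page's categories into (written in wikitext, added by templates)."""
    explicit = {normalize(name) for name in CATEGORY_LINK.findall(wikitext)}
    written, transcluded = [], []
    for cat in all_categories:
        (written if normalize(cat) in explicit else transcluded).append(cat)
    return written, transcluded


# Illustrative input: one category typed by hand, one set by the infobox.
wikitext = "{{Wikidata Infobox}}\n[[Category:Given names|Lonnie]]"
all_cats = ["Given names", "Uses of Wikidata Infobox with no image"]
written, transcluded = split_categories(all_cats, wikitext)
print(written, transcluded)
```

A report could then list the categories for which the written list is empty.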
Cat-a-lot performance, maintenance
Cat-a-lot has seemed very slow for a few days. For example, it takes 8 minutes to edit a batch of 500 files (locking the tab). Can someone confirm whether this is a server or a scripting issue, and can it be checked and fixed for speed? Would a rollback help? It should not be my bandwidth, but is there maybe advice on a local setting? Thank you Peli (talk) 12:21, 1 September 2024 (UTC)
- I have the same experience. Very slow. Wouter (talk) 12:51, 1 September 2024 (UTC)
- Please see this thread. Prototyperspective (talk) 21:59, 1 September 2024 (UTC)
Tech News: 2024-36
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Weekly highlight
- Editors and volunteer developers interested in data visualisation can now test the new software for charts. Its early version is available on beta Commons and beta Wikipedia. This is an important milestone before making charts available on regular wikis. You can read more about this project update and help to test the charts.
Feature news
- Editors who use the Special:UnusedTemplates page can now filter out pages which are expected to be there permanently, such as sandboxes, test-cases, and templates that are always substituted. Editors can add the new magic word __EXPECTUNUSEDTEMPLATE__ to a template page to hide it from the listing. Thanks to Sophivorus and DannyS712 for these improvements. [10]
- Editors who use the New Topic tool on discussion pages, will now be reminded to add a section header, which should help reduce the quantity of newcomers who add sections without a header. You can read more about that, and 28 other community-submitted tasks that were resolved last week.
- Last week, some Toolforge tools had occasional connection problems. The cause is still being investigated, but the problems have been resolved for now. [11]
- Translation administrators at multilingual wikis, when editing multiple translation units, can now easily mark which changes require updates to the translation. This is possible with the new dropdown menu.
Project updates
- A new draft text of a policy discussing the use of Wikimedia's APIs has been published on Meta-Wiki. The draft text does not reflect a change in policy around the APIs; instead, it is an attempt to codify existing API rules. Comments, questions, and suggestions are welcome on the proposed update’s talk page until September 13 or until those discussions have concluded.
Learn more
- To learn more about the technology behind the Wikimedia projects, you can now watch sessions from the technology track at Wikimania 2024 on Commons. This week, check out:
- Charts, the successor of Graphs - A secure and extensible tool for data visualization (25 mins) – about the above-mentioned Charts project.
- State of Language Technology and Onboarding at Wikimedia (90 mins) – about some of the language tools that support Wikimedia sites, such as Content/Section Translation, MinT, and LanguageConverter; also the current state and future of languages onboarding. [12]
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
upload wizard for books
Is there a campaign interface (Upload Wizard configuration) that fills in {{Book}} instead of {{Information}}?
It could make it easier for people to understand files like the ones in the Chinese categories.
∞∞ Enhancing999 (talk) 12:30, 3 September 2024 (UTC)
Audio of music contain copyvio thumbnails
The thumbnails do not show up on the audio file's page, but they are embedded in the file, and when downloading the file one can see or extract them. Example.
- Many of these thumbnails are copyrighted. This means usually the thumbnail would need to be removed. video2commons already imports audio files without the thumbnails. Could there be some script or bot that categorized all audio files with a thumbnail set into e.g. Category:Audio files with embedded thumbnail?
- Then as a next step one could remove all of them at scale and efficiently using some metadata removal tool, for example similar to command eyeD3 --remove-all-images **/*.opus (applied to all audio files in some category). I guess it would be best to not remove the thumbnail for identified cases where the thumbnail is CCBY as well, these could e.g. be moved to another category or audio files whose thumbnails should be removed to a subcategory of the category above. (A more sophisticated method would be to reverse image search each thumbnail for finds via tineye so only non-original works are deleted and thumbnails created by the person licensing the work under CCBY kept (if the CCBY license also applies to the thumbnail) but I don't think this would be necessary as it would cause a lot of manual work of checking whether it's indeed a copyvio and whether thumbnails without reverse search result are indeed not copyvios.)
Just as a note: the audio files of the example display 0:00 as duration instead of the duration which only shows after one has clicked play. Prototyperspective (talk) 00:07, 4 September 2024 (UTC)
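As a rough illustration of the detection step, a bot could flag Ogg/Opus files that carry embedded cover art before anything is removed. The sketch below is only a heuristic and rests on an assumption: that cover art in Ogg files is stored as a comment field named METADATA_BLOCK_PICTURE, which is the convention as far as I know. It does not parse the Ogg container properly, so a real bot should use a proper tagging library instead. The function name has_embedded_picture is made up for illustration.

```python
# Field name under which Ogg Vorbis/Opus comments conventionally store cover art
# (assumption; field names are case-insensitive, hence the uppercase comparison).
PICTURE_KEY = b"METADATA_BLOCK_PICTURE="


def has_embedded_picture(data: bytes, scan_limit: int = 1 << 20) -> bool:
    """Heuristically check whether Ogg audio bytes contain an embedded picture.

    Only the first `scan_limit` bytes are scanned, because the comment header
    sits at the start of the stream. This is a byte search, not a real parser.
    """
    head = data[:scan_limit].upper()
    return head.startswith(b"OGGS") and PICTURE_KEY in head


# Synthetic byte strings standing in for real files (not valid audio).
with_art = b"OggS" + b"\x00" * 20 + b"OpusTags" + b"metadata_block_picture=AAAA"
without_art = b"OggS" + b"\x00" * 20 + b"OpusTags" + b"title=Example"
print(has_embedded_picture(with_art), has_embedded_picture(without_art))
```

Files for which the check returns True could then be added to a tracking category such as the one suggested above and reviewed before any thumbnail is stripped.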
Files still in category but categoryname no longer in wikitext
Hi there, I am hoping someone can help me out with the following: I attempted to move the files with inventory numbers starting with M in this category to a sub-category specifically designated for that upload from one of our partners (for metrics and outreach purposes). I used Cat-a-lot, but it reported it was 'unable to move files to category because old category name does not exist'. I then used OpenRefine to change the category name in the wikitext. This worked and moved them to the designated category, but for some reason the files also still appear in the original category, even though the original category name is no longer present in the wikitext of the files. I have tried to find the solution for this myself but am at a loss and really hope someone here can point me in the right direction towards solving this! Thank you so much, any help with this is greatly appreciated. MichellevL (WMNL) (talk) 14:51, 5 September 2024 (UTC)
- The category is added by template: Template:Universiteitsbibliotheek Maastricht
∞∞ Enhancing999 (talk) 15:13, 5 September 2024 (UTC)