Details
- New Feature
- Status: Open
- P3
- Resolution: Unresolved
- Very Small (VS) < 1 day
- Medium < 5 days
Description
Purpose: This UXPROD captures the need to handle diacritics correctly when searching and sorting. It is not a Swedish-specific problem but a general one, surfaced by Theodor in the context of Swedish (Chalmers), and it needs to be addressed holistically. The related linked issues should also be reviewed (see links).
Below are the details from the original bug (UISE-68). Lots of good discussion can also be found in that bug's comments:
Original Issue Summary: Codex search treats Swedish diacritics as ascii equivalents
Overview: When conducting title-level searches in Codex for titles containing Swedish diacritics (å, ä, ö), the search behaves as if those characters were reduced to their ASCII equivalents (a, o).
Steps to Reproduce:
- Create a couple of records in Inventory with titles starting with a, å, ä or similar
For example:
"Den aktansvärda"
"Den äkta varan"
"Den åländska skärgården"
"The Åland archipelago"
"Ålöndska skärgården"
"The Aland archipelago"
- Go to Codex and conduct a title search for åland
Expected Results:
The title "The Åland archipelago" is returned.
(An alternative expected result is that "Den åländska skärgården" is also returned, since "åländska" is a form of "åland" that a Swedish stemming algorithm might be able to catch.)
Actual Results:
"The Aland archipelago" is returned together with the above and a few other items containing the string "aland". See the attached image.
Additional Information: Will add these in separate issues.
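The behaviour reported under Actual Results can be reproduced outside FOLIO with a small sketch. It assumes, without having confirmed it against the Codex implementation, that the search behaves like a case-insensitive substring match on titles whose diacritics have been stripped. The function names `fold_diacritics` and `search` are hypothetical and exist only for this illustration.

```python
import unicodedata

# Titles from the Steps to Reproduce section.
TITLES = [
    "Den aktansvärda",
    "Den äkta varan",
    "Den åländska skärgården",
    "The Åland archipelago",
    "Ålöndska skärgården",
    "The Aland archipelago",
]


def fold_diacritics(text: str) -> str:
    """Case-fold and drop combining marks, so that å/ä become a and ö becomes o."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def keep_diacritics(text: str) -> str:
    """Case-fold only; å, ä and ö stay distinct from a and o."""
    return unicodedata.normalize("NFC", text).casefold()


def search(query: str, titles: list[str], fold: bool) -> list[str]:
    """Hypothetical substring search, with or without diacritic folding."""
    norm = fold_diacritics if fold else keep_diacritics
    needle = norm(query)
    return [title for title in titles if needle in norm(title)]


# Assumed model of the reported behaviour: diacritics are folded away.
folded_hits = search("åland", TITLES, fold=True)
# The first form of the expected behaviour: å is a letter in its own right.
exact_hits = search("åland", TITLES, fold=False)

print(folded_hits)
print(exact_hits)
```

With folding, the query becomes "aland" and also matches "The Aland archipelago" and "Den åländska skärgården". Without folding, only "The Åland archipelago" matches. Catching "åländska" as a form of "åland" would need stemming on top of this, which the sketch does not attempt.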
This particular issue might be solved by changing the collation on the relevant tables in Postgres to Swedish (see https://www.postgresql.org/docs/9.1/static/collation.html), but I believe it is related to a bigger discussion on search technology.
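To illustrate what a Swedish collation changes, here is a minimal Python sketch that compares three orderings of the same titles. It is not the PostgreSQL mechanism itself: in PostgreSQL the equivalent would be a `COLLATE` clause or column collation, and the available collation names depend on the installation. The sample titles, `SWEDISH_ALPHABET` and `swedish_key` are assumptions made for this example, and the key is a simplification that only places å, ä, ö after z; a full Swedish collation has further rules.

```python
import unicodedata

# Simplified Swedish alphabet: å, ä, ö are separate letters sorted after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(title: str) -> list[int]:
    """Sort key using the simplified alphabet; other characters sort first."""
    text = unicodedata.normalize("NFC", title).casefold()
    return [RANK.get(ch, -1) for ch in text]


def ascii_folded_key(title: str) -> str:
    """Sort key that strips diacritics, treating å/ä as a and ö as o."""
    decomposed = unicodedata.normalize("NFD", title.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical sample titles, not taken from the issue.
titles = ["Ödla", "Zebra", "Ärta", "Apa", "Ål"]

by_code_point = sorted(titles)                    # raw Unicode code point order
by_ascii_fold = sorted(titles, key=ascii_folded_key)
by_swedish = sorted(titles, key=swedish_key)

print(by_code_point)
print(by_ascii_fold)
print(by_swedish)
```

Code point order puts Ä before Å, and the ASCII-folded order interfiles Å, Ä and Ö with A and O. Only the Swedish key gives the order a Swedish reader would expect: Apa, Zebra, Ål, Ärta, Ödla.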
Issue Links
- clones
  - UISE-68 Codex search treats Swedish diacritics as ascii equivalents (Closed)
- is blocked by
  - FOLIO-1246 Implement Postgres Full Text Search functionality (Closed)
  - MODCXMUX-25 sort according to tenant's locale (Open)
  - MODINVSTOR-148 sort according to tenant's locale (Open)
  - STCOM-78 Use new CQL sort-modifiers to specify locale for collation (Open)
- is duplicated by
  - FOLIO-850 Locale-specific sorting (Closed)
- relates to
  - RMB-37 SQL sorting/comparing must use the tenant's locale/collation (Draft)
  - UISE-69 Codex search results treats Swedish diacritics as ascii equivalents when sorting results (Closed)
  - UISE-70 Codex search results are taking Nonfiling characters into account when sorting (Closed)
  - BF-264 Sorting of contributor types ignores spaces (Closed)
  - FOLIO-1955 Create databases using und-x-icu collation (Open)
  - UIIN-264 In edit mode of the Instance record. The contributor type list is to be sorted like LoC's Code List for Relators (Closed)
  - UISE-80 Search results are not sorted by title (Closed)
  - UIU-1726 User app search: observe and interfile diacritical marks in Swedish, German, French, Spanish, Portuguese, Russian. (Draft)
  - UXPROD-1045 Fulltext Search (Closed)
  - UXPROD-1135 Locale-driven search (Open)