ENH: add grouping (letter display) for pairwise comparison - rebased #9493

Open: wants to merge 2 commits into base: main

josef-pkt
Member

Rebased version of the earlier PR; closes #3674.

The second commit uses the unit test from the previous PR, with adjustments for unique ordering and labeling.

@pep8speaks

Hello @josef-pkt! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 61:19: E124 closing bracket does not match visual indentation

k_letters = len(all_)
letter = [] * k_letters
for i in range(k_groups):
    ss = ''.join(alphabet[j] if i in set_j else ' ' for j, set_j in enumerate(all_))
Member Author

for visualization it looks better if there is a space between letters
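
A minimal sketch of the spacing suggestion, assuming the cliques are available as a list of sets of group indices. The helper name `letter_strings`, its `sep` and `empty` arguments, and the example cliques are hypothetical and not taken from the PR:

```python
import string


def letter_strings(cliques, k_groups, sep=" ", empty=" "):
    """Build one letter string per group (hypothetical helper).

    `cliques` is a list of sets of group indices; clique j gets letter j.
    Letter slots are joined with `sep` so adjacent letters do not touch.
    """
    alphabet = string.ascii_lowercase
    return [
        sep.join(alphabet[j] if i in members else empty
                 for j, members in enumerate(cliques))
        for i in range(k_groups)
    ]


# Hypothetical cliques for five groups: {0, 2, 3} share "a", 1 gets "b", 4 gets "c"
cliques = [{0, 2, 3}, {1}, {4}]
letters = letter_strings(cliques, k_groups=5)
for row in letters:
    print(repr(row))
```

Using `sep.join` instead of `''.join` keeps each letter in a fixed column, at the cost of a wider string.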

Parameters
----------
edges : array_like, 2-D, or None
array or list of lists with edges in rows and two.
Member Author

Unfinished sentence?
Do we have unit tests for array input?

@josef-pkt
Member Author

josef-pkt commented Jan 27, 2025

Funny detail with the current code:

When the distance matrix is specified directly and a diagonal element is nonzero (1), that group does not form a clique with itself. No letter is assigned to the group (if it also has nonzero distance to the other groups).
I found this because I made a mistake in a distance matrix example.
Is this a bug or a feature?

Insufficient input validation?
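
One way the validation question could be handled is sketched below. The function name `check_distance_matrix` and the choice to raise rather than silently zero the diagonal are assumptions for illustration, not what the PR does:

```python
import numpy as np


def check_distance_matrix(dist):
    """Hypothetical helper: validate a square 0/1 "differs" matrix."""
    dist = np.asarray(dist)
    if dist.ndim != 2 or dist.shape[0] != dist.shape[1]:
        raise ValueError("distance matrix must be square")
    if not np.array_equal(dist, dist.T):
        raise ValueError("distance matrix must be symmetric")
    if np.any(np.diag(dist) != 0):
        # a nonzero diagonal would mean a group differs from itself
        raise ValueError("diagonal of distance matrix must be zero")
    return dist


good = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
print(check_distance_matrix(good).shape)
```

Raising an error keeps a typo in a hand-written matrix from producing a group without any letter.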

As an aside:
I used frozensets because I collect the cliques in a set of sets, which requires hashable elements (frozenset) inside.
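
A small illustration of the hashability point, using made-up group sets rather than anything from the PR:

```python
groups = [{0, 2, 3}, {1}, {0, 2, 3}]

# A plain set cannot be an element of another set because it is mutable.
try:
    unique = {g for g in groups}
except TypeError as exc:
    print("set of sets fails:", exc)

# frozenset is immutable and hashable, so duplicates collapse as expected.
unique = {frozenset(g) for g in groups}
print(len(unique))
```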

@josef-pkt
Member Author

Problem with pandas HTML in the notebook:
print looks fine, but the HTML rendering strips all the leading and trailing whitespace in the letters, so that they end up in just a single column of letters.

print(group_frame)
      name      mean  ci_sim_low  ci_sim_upp letters
0   France  3.983286    3.835050    4.131523   a    
1  Germany  4.867173    4.726966    5.007380     b  
2    Japan  4.102950    3.955065    4.250834   a    
3   Sweden  4.113893    3.979916    4.247869   a    
4      USA  6.317059    6.187845    6.446273       c

We need an option to add "-" for empty letter positions.

(The above is from an adjusted version of the cylinder example where Germany also differs from the rest.)
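
A possible sketch of the "-" filler, assuming single-character letters separated by single spaces as in the printed frame. The helper `fill_empty` is hypothetical and not part of the PR:

```python
def fill_empty(letters, filler="-"):
    """Replace blank letter slots with `filler`, keeping separators.

    Assumes letter slots sit at even positions and separators at odd ones.
    """
    return "".join(
        filler if (pos % 2 == 0 and ch == " ") else ch
        for pos, ch in enumerate(letters)
    )


rows = ["a    ", "  b  ", "a    ", "a    ", "    c"]
filled = [fill_empty(s) for s in rows]
for s in filled:
    print(s)
```

On a DataFrame this could be applied with `group_frame["letters"].map(fill_empty)`, so that the column positions survive whitespace stripping in the HTML display.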

sorted by mean:

frame_s = group_frame.sort_values("mean")
print(frame_s)
      name      mean  ci_sim_low  ci_sim_upp letters
0   France  3.983286    3.835050    4.131523   a    
2    Japan  4.102950    3.955065    4.250834   a    
3   Sweden  4.113893    3.979916    4.247869   a    
1  Germany  4.867173    4.726966    5.007380     b  
4      USA  6.317059    6.187845    6.446273       c

@josef-pkt
Member Author

josef-pkt commented Jan 28, 2025

It looks like relabeling integer indices is quite easy.
It uses the standard method of converting group indicators to and from a dummy matrix.

e.g. swap the labels of the first two groups

a = np.arange(3)
b = np.array([1, 0, 2])
xx = np.repeat(a, 3)
xx, a, b
(array([0, 0, 0, 1, 1, 1, 2, 2, 2]), array([0, 1, 2]), array([1, 0, 2]))

(xx[:, None] == a) @ b
array([1, 1, 1, 0, 0, 0, 2, 2, 2])
# or (which should also work if `b` is a string array)
b[np.nonzero(xx[:, None] == a)[1]]
array([1, 1, 1, 0, 0, 0, 2, 2, 2])

I did not manage to get a linear algebra row permutation matrix to work.

Update:
This also seems to work for 2-dim instead of 1-dim arrays.
So we could convert both pair indices at the same time, but I don't think we would really gain anything.

xx2 = np.column_stack((xx, xx[::-1])) 
(xx2[..., None] == a) @ b
array([[1, 2],
       [1, 2],
       [1, 2],
       [0, 0],
       [0, 0],
       [0, 0],
       [2, 1],
       [2, 1],
       [2, 1]])
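
As a possible alternative to the dummy-matrix product above, the same relabeling can be sketched with `np.searchsorted`, which avoids building the boolean matrix. This is only an illustration and assumes the original labels `a` are sorted and that every value in `xx` occurs in `a`; the string labels in `b` are made up:

```python
import numpy as np

a = np.arange(3)                # original labels, must be sorted
b = np.array(["B", "A", "C"])   # new labels: b[k] replaces a[k]
xx = np.repeat(a, 3)

# position of each element of xx within a, then look up the new label
relabeled = b[np.searchsorted(a, xx)]
print(relabeled)

# the same lookup works elementwise on a 2-dim array of pair indices
xx2 = np.column_stack((xx, xx[::-1]))
relabeled2 = b[np.searchsorted(a, xx2)]
print(relabeled2)
```

When the original labels are already `0, ..., k-1`, as here, `b[xx]` gives the same result directly.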
