Packaging/2020-05-05-pip¶
Legacy Wiki Page
This page was migrated from the old MoinMoin-based wiki. Information may be outdated or no longer applicable. For current documentation, see python.org.
Resolver Dev Syncup (5 May 2020)¶
Participants:
Paul
Pradyun
Tzu-Ping
Agenda?
- YAML tests
They go through both the new and old resolvers by default (either can be disabled)
Checking output message - can’t do that
Do what YAML is good for; use regular tests for what it’s not
Pre-installing package before running the test body
Workaround? is to do a multi-stage test (which we can do)
Can’t do editable (probably will be added later)
Some tests are very procedural, probably not likely to be translatable to YAML
https://hackmd.io/niav7TZET4SxHR8G0l56PA – an example
[then… everyone gets into co-working mode]
pip UX and resolver team meeting note pad¶
(including May 5th and 7th collab time)
7 May collab time
* What do we want to focus on?
Bernard: are the notes below correct
TZP: would like to focus on the topic from the last time
Pradyun: would like to make decisions about UX-related pip issues https://github.com/pypa/pip/milestone/10, but today might not be the right time since the decisionmakers aren’t in the call. Sumana asked to make that list
Something that could use funding:
build logic, reusable libraries for build logic and install logic?
https://github.com/pypa/pip/blob/master/tests/functional/test_new_resolver.py
https://github.com/pypa/pip/blob/master/tests/yaml/install/conflicting_triangle.yml#L1
5 May 2020
Pradyun and Nicole and Bernard
(with additional notes on 7 May in conversation among Bernard, Tzu-Ping, and Pradyun)
Dependencies¶
For pip resolver work we have 2 kinds of deps - build time and install time
build time¶
I need this thing Y so I can build Z
Install (run) time¶
Package X needs package Y when it runs so that it works (e.g. needing requests to make network calls)
The dependency resolver project is dealing with install-time dependencies, not build-time deps.
(The team is interested in gathering build-time deps from research, but it’s not the *current focus*)
Pradyun has been working on how to visualize dependencies.
conflicting cases - https://pradyun.github.io/scrap/four.html
what’s more complicated is visualizing longer dependency chains
The team doesn’t know how to visualize the complicated situations
The team doesn’t know all the ways we can end up with conflicts, so it’s hard to give users useful feedback/notifications
Research opportunity - research with users
Exceptions¶
There are 3 main examples.
1. Resolution impossible: this conflict is not logically resolvable. Every choice we can make has a conflict. So the error message can be clear and specific.
Some algorithms exist to handle this case; one is PubGrub. While doing dependency resolution, PubGrub maintains a “graph of the status of compatibilities/incompatibilities”. It then generates the error messages based on the final graph. Examples of the outputs here: https://github.com/dart-lang/pub/blob/master/doc/solver.md#examples and blogpost here: https://medium.com/@nex3/pubgrub-2fb6470504f
Right now pip isn’t maintaining this pubgrub-esque compatibilities/incompatibilities graph. It is doable, but not there.
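A much simpler toy than PubGrub, sketched in Python, may help show the kind of bookkeeping involved: remember who asked for what, and when no candidate version satisfies every requirer, build the message from those records. The function name `explain_conflicts` and the data shapes are made up for illustration; this is not pip's or PubGrub's actual code, and it does no backtracking over the requirers' own versions.

```python
from collections import defaultdict


def explain_conflicts(requirements, available):
    """Report packages for which no version satisfies every requirer.

    requirements: iterable of (requirer, package, allowed_versions) tuples
                  (hypothetical shape; allowed_versions stands in for a specifier)
    available:    dict mapping package name to the set of versions that exist
    Returns a list of human-readable messages, empty if nothing conflicts.
    """
    # Record who asked for each package, so the error can name them later.
    asks_by_package = defaultdict(list)
    for requirer, package, allowed in requirements:
        asks_by_package[package].append((requirer, set(allowed)))

    messages = []
    for package, asks in sorted(asks_by_package.items()):
        # Narrow the candidate versions by every requirer's constraint.
        candidates = set(available.get(package, ()))
        for _, allowed in asks:
            candidates &= allowed
        if not candidates:
            reasons = "; ".join(
                f"{requirer} requires {package} in {sorted(allowed)}"
                for requirer, allowed in asks
            )
            messages.append(f"Cannot install {package}: {reasons}")
    return messages


# A "conflicting triangle": A and B both depend on C, with disjoint versions.
triangle = [("A 1.0", "C", {"1.0"}), ("B 1.0", "C", {"2.0"})]
for line in explain_conflicts(triangle, {"C": {"1.0", "2.0"}}):
    print(line)
```

The point of the sketch is only that the message is derived from recorded constraints rather than from the final failure alone, which is roughly what maintaining an incompatibility graph buys.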
Research questions
What is useful context (what the pip resolver did internally to get here) for the user?
what is possible to display?
rounds is not the right metric (is this relevant to the user?)
this must be relevant to the user!
What information do we display to them?
How specific does the error message content need to be?
How do we present error messages for resolution impossible?
Who do we cater the messages to?
Possible design deliverables¶
Error message content¶
We can create different levels of message
The “ideal message”
The “this is better than no message”
The “this is not helpful” message!
Terminology¶
pinning dependencies
Documentation¶
Document resolution strategies (Use these recommendations to users as possible ways to resolve them?)
Documentation on recommendations on pinning dependencies (e.g. you’re less likely to see this issue if you have fewer pinned package versions). There are blog posts; can they be rolled into trustworthy, good documentation along the lines of “if you don’t know what to do, start with these options and these are the reasons why”? See https://caremad.io/posts/2013/07/setup-vs-requirement and https://packaging.python.org/tutorials/managing-dependencies/
The existing documentation dives too much/early into abstract and philosophical topics
The desired documentation should be more “usage oriented”
2. Resolution too-deep
This is computationally expensive and takes a lot of time due to the large search space. This means the resolver doesn’t really know why it has failed. We’re not sure how big an issue this is for users, and we’re not sure which use cases cause this exception.
So far there is only 1 reported case of too-deep.
The answer will probably be vague, due to the unknown space.
Here are some things we could do:
“I spent a lot of time resolving these deps, package A and B caused most problems, maybe you should focus on those”
Collect numbers after successful and failed resolutions from users, so we know what is the better metric
user-specified requirement count
total discovered requirement/candidate count
round count
backtrack count
this could be displayed after pip has finished the resolution
(Explanation: we really don’t know the causes, so maybe we should show the stats above and let users report those to the team. This wouldn’t directly help users resolve the issue, but they could report the stats to the pip team to start the troubleshooting. If the team then gets enough understanding, we can possibly display the “useful” debugging information.)
Something along the lines of pytest reports might be a good way
Collect data for successful resolutions
seeing what the difference is between successful and failed resolutions
also statistics as above
thought: we don’t know what the reaction from the user would be to this
Pradyun: there are no ways to get those stats: transparent, clear, consensual
this is done in Debian’s popularity contest (https://popcon.debian.org/), Homebrew (https://formulae.brew.sh/analytics/). TZP reminds us that the way Homebrew did it was controversial (https://news.ycombinator.com/item?id=11566720)
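As a rough sketch of the statistics idea above (the four counts, plus naming the packages that caused most trouble), something like the following could be filled in during resolution and printed at the end. The class `ResolutionStats`, its fields, and the output format are all hypothetical; nothing here reflects what pip actually records.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ResolutionStats:
    """Hypothetical container for the counts listed in the notes."""

    user_requirements: int = 0       # user-specified requirement count
    discovered_candidates: int = 0   # total discovered requirement/candidate count
    rounds: int = 0                  # round count
    backtracks: int = 0              # backtrack count
    backtrack_causes: Counter = field(default_factory=Counter)

    def record_backtrack(self, package):
        """Note that the resolver had to backtrack because of `package`."""
        self.backtracks += 1
        self.backtrack_causes[package] += 1

    def summary(self, top=2):
        """Render a short report, naming the `top` most troublesome packages."""
        lines = [
            f"user-specified requirements: {self.user_requirements}",
            f"discovered candidates: {self.discovered_candidates}",
            f"rounds: {self.rounds}",
            f"backtracks: {self.backtracks}",
        ]
        worst = [name for name, _ in self.backtrack_causes.most_common(top)]
        if worst:
            lines.append("most backtracking caused by: " + ", ".join(worst))
        return "\n".join(lines)


stats = ResolutionStats(user_requirements=3, discovered_candidates=120, rounds=40)
for package in ["A", "B", "A", "C", "A", "B"]:
    stats.record_backtrack(package)
print(stats.summary())
```

A plain-text summary like this is something a user could paste into a bug report, which fits the "report it to the team" flow described above without any automatic data collection.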
participant recruitment¶
This exception will happen in projects which are complex and have a lot of dependencies
Research questions
what do users need (and maybe want) in this case?
what should the resolver display as it’s doing the dependency resolution?
no/lots of logging
what should the resolver display when the dependency resolution is finished?
3. Requirement inconsistent: a build-time bug. We can ignore this.
Problem statement¶
There are 2 exceptions, 1 and 2 above. Each has its own need to 1) compute the error (figure out what information to present) and 2) then present the information.
pip is able to work out which exception has occurred.
The question is: how do we compute the error message?
To provide a useful message to the user, it would be easier to know what the complications are (there is always more than one cause to the problem); otherwise we’ll provide too many solutions, which is useless for fixing the problem.
presenting the user:
with the conflicts
with the candidates that had conflicted requirements
If I run out of milk and want to buy some, I have to find my car key. Is the error that I can’t find my key, or that I have no milk?
Goal: get milk
Tasks:
go to the shop
subtask
find my key
drive to the store
buy milk
Example 1. Error: could not get milk, need car, cannot find car key
less detail: could not get milk, cannot find car key (I’ve inferred that I know I have to drive)
more detail: could not get milk, need trip to store, need transportation method, need car, need car key, cannot find car key
Example 2. Error: could not get milk, need car, cannot find car key. What about using a bike?
Pradyun: we don’t know if bike exists
Pradyun: the direction we’re going in is doing what pubgrub does.
milk depends on transport and shop (which is PyPI)
transport is satisfied by bike or car or scooter or X
shop is available or not available
Resolver: I can’t give you milk, because I can’t give you transport
Building reporting logic analogous to PubGrub’s and using that instead
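To make the milk/transport/shop framing concrete, here is a small Python sketch in which a need can be satisfied by any one of several alternatives, and the message names the unsatisfied need rather than one specific missing item. The function `explain_failure` and its arguments are invented for this illustration and are not part of pip or PubGrub.

```python
def explain_failure(goal, needs, providers, available):
    """Return a resolver-style message for the first unsatisfiable need, or None.

    needs:     dict mapping a goal to the things it depends on
    providers: dict mapping a need to the alternatives that can satisfy it
    available: set of things that actually exist
    """
    for need in needs.get(goal, []):
        # A need with no listed alternatives can only be satisfied by itself.
        options = providers.get(need, [need])
        if not any(option in available for option in options):
            tried = " or ".join(options)
            return (
                f"I can't give you {goal}, because I can't give you {need} "
                f"(no {tried} available)"
            )
    return None


needs = {"milk": ["transport", "shop"]}
providers = {"transport": ["bike", "car", "scooter"]}

# Shop is reachable in principle, but there is no way to get there.
print(explain_failure("milk", needs, providers, available={"shop"}))
# With a bike available, nothing blocks the goal.
print(explain_failure("milk", needs, providers, available={"shop", "bike"}))
```

This also shows why "what about using a bike?" is hard to answer from a single failed path: the message only makes sense once every alternative for transport has been ruled out.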
TZP: let’s start with the simplest error message, gradually fill it out based on user feedback - “Error: could not get milk, cannot find car key”.
examples pip get milk
could not get milk, need to go to store, cannot find car key
could not get milk, need car key, cannot find car key
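The detail levels discussed above can be sketched as one function that takes the chain of needs and a detail setting. The function name `render_error` and the three level names are assumptions made for this example; the message wording follows the milk examples in these notes.

```python
def render_error(goal, chain, detail="default"):
    """Render a failure message at one of three levels of detail.

    chain is the ordered list of needs leading from the goal to the thing
    that could not be found (the last item).
    """
    missing = chain[-1]
    if detail == "less":
        shown = []              # only the goal and the missing item
    elif detail == "more":
        shown = chain           # every step in the dependency chain
    else:
        shown = chain[-2:-1]    # just the step right before the missing item
    parts = [f"could not get {goal}"]
    parts += [f"need {step}" for step in shown]
    parts.append(f"cannot find {missing}")
    return "Error: " + ", ".join(parts)


chain = ["trip to store", "transportation method", "car", "car key"]
for level in ("less", "default", "more"):
    print(render_error("milk", chain, detail=level))
```

Starting from the "less" form and adding steps as user feedback comes in matches TZP's suggestion, since the chain data is the same in all three cases and only the rendering changes.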
Questions:
do we display all the things in the dependency chain? (I think we do this)
what level of detail do we need to go to when displaying the dep. chain?
In order to figure out what information is to be presented to the user, we would investigate the conflict and choose one of our tools from our “toolbox” - what information to display.
For 1 and 2 we don’t know what caused the error in most situations, so we need to point users in the general direction of it. What we don’t know is a) how to point and b) how to compute those pointers.
We need to get the resolver failure cases in order to work out what data we can expose to the user in those cases; then devs work out what’s possible to do.
Failure cases¶
How do we get them? They need to come from users ideally.
Expert approach¶
Use metadata
I’d try and use an older version of the package to test if it works
patch the package
swap out the package for something else
Possible design deliverables¶
Error message content¶
We can create different levels of message
The “ideal message”
The “this is better than no” message
The “this is not helpful” message!
Documentation¶
Document resolution strategies
Use these recommendations to users as possible ways to resolve them?
General research questions¶
How do we present “vague failure” messages to users? Essentially “it’ll take us too much time to figure out why it’s failed”.
How would you go about resolving situation 1 and 2?
What do other package managers do?
npm doesn’t have this issue
apt-get?
Project questions¶
Is implementing PubGrub or a similar algorithm in scope for the project?
Raise to Sumana how successful the sync today was with Pradyun
Dependency resolution and PubGrub¶
Next steps¶
Identify research participants to research this problem with
Method(s) of research
Method 1
we contact users, ask them to try building their project/application with the new resolver (- send questionnaire to 50 users, ask them to self-select as users with “complex and have a lot of dependencies”?)
we then ask them to email us any resolution errors they have
then we talk to them
email a pilot
TODO: Nicole will document method 1
TODO: Bernard to look through ~200 participants and identify users with “complex and large dependencies”
TODO: write some questions to “get a shape” of how complicated the project is
TODO: find out if there is a command/package to run to work out the number of deps in a project/how complex it is
https://stackoverflow.com/questions/42237072/list-dependencies-in-python seems to be a solution
send questionnaire to 50 users, ask them to self-select as users with “complex and have a lot of dependencies”
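On the TODO about counting dependencies: one possible approach, sketched here in Python, is to walk the installed metadata with the standard library's `importlib.metadata` and count the distinct packages reached. This is an assumption about how such a measure could be computed, not what the linked answer proposes. The helper names are made up, the name parsing is a simplification, requirements behind an `extra` marker are skipped, and other environment markers are ignored.

```python
import re
from importlib import metadata

# Leading project name of a requirement string such as "requests>=2.0".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name):
    """Lowercase a project name and collapse runs of -, _ and . into one dash."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_requires(name):
    """Requirement strings of an installed distribution, or [] if unknown."""
    try:
        return metadata.requires(name) or []
    except metadata.PackageNotFoundError:
        return []


def transitive_dependencies(root, requires=installed_requires):
    """Return the set of normalized names reachable from `root`."""
    root = normalize(root)
    seen = {root}
    stack = [root]
    while stack:
        for req in requires(stack.pop()):
            # Skip optional dependencies that only apply to an extra.
            if ";" in req and "extra" in req.split(";", 1)[1]:
                continue
            match = _NAME.match(req)
            if not match:
                continue
            dep = normalize(match.group(1))
            if dep not in seen:   # also guards against dependency cycles
                seen.add(dep)
                stack.append(dep)
    return seen - {root}


# Count what the locally installed pip (if any) pulls in.
print(len(transitive_dependencies("pip")))
```

A count like this could be one of the questions used to "get a shape" of a participant's project, though it only sees what is installed in the current environment.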
Method 2
is there a method 2!?