What story points represent depends on what information you have. If you don't know your team's velocity, then story points are only a measure of relative complexity. Once you have your team's velocity, you can use it to convert points to time.
What is relative complexity? How can you compare the complexity of <changing the colors of one button> with <implementing a sorting algorithm>, other than by how long they might take?
Then it would be how long they might take relative to each other. To give a specific time you would have to know the stage of development for the product, the health and ease-of-use of the CI/CD pipeline, the current interviewing burden of the team, the rate at which production incidents are occurring, the number and length of meetings that the team members are in, etc., etc., etc. More junior developers generally won't do a great job of that. But they can usually estimate how long a task would take them if they had absolutely nothing else to do and if their build/test/deploy pipeline were optimal, and with story points that's all they really need to worry about. In that way story points are a tool that helps the team make better estimates in the face of varying levels of expertise among individual contributors at estimating time-to-complete.
By judging complexity and measuring velocity you get an estimate of time that intrinsically takes all of the variables into account. It's a powerful tool when used right.
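As a minimal sketch of that conversion (the sprint length, completed-point totals, and backlog size below are made-up assumptions, purely to show the arithmetic):

```python
# Points the team actually completed in recent sprints (assume 2-week sprints).
completed_points = [21, 18, 24, 19]

# Velocity is measured, not estimated: average points delivered per sprint.
velocity = sum(completed_points) / len(completed_points)

# A backlog estimated purely by relative complexity.
backlog_points = 130

# The conversion: points / velocity gives sprints, which maps to calendar time.
sprints_needed = backlog_points / velocity
weeks_needed = sprints_needed * 2  # 2-week sprints assumed above

print(f"velocity ~ {velocity:.1f} points/sprint")
print(f"backlog ~ {sprints_needed:.1f} sprints, about {weeks_needed:.0f} weeks")
```

The point is that nobody estimates hours directly; whatever meetings, incidents, and interviews eat into a sprint shows up in the measured velocity and therefore in the converted time.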
> To give a specific time you would have to know the stage of development for the product, the health and ease-of-use of the CI/CD pipeline, the current interviewing burden of the team, the rate at which production incidents are occurring, the number and length of meetings that the team members are in, etc., etc., etc.
This is a strawman. When asked to estimate a task, people essentially always think in terms of "how long would it take if this were the only thing I was working on". Of course, when a junior dev gives an estimate like this, you don't put it into a Gantt chart and start planning release celebrations based on it: you add appropriate buffers and uncertainty based on who made the estimate.
> By judging complexity and measuring velocity you get an estimate of time that intrinsically takes all of the variables into account. It's a powerful tool when used right.
Again I ask, what is complexity, other than an estimate of time taken?
Also, "velocity" is just an average across people and sprints. This would only converge to a meaningful estimate of time IF people are consistently failing their estimates in the same way. If the error bar on the estimates varies wildly, taking the average doesn't do anything meaningful. I think this is much more common than consistently miss-estimating in the same way.
Not to mention, if these estimates of "complexity" don't take external factors into account, then they'll always be off by unpredictable amounts. Velocity measurements fail to account for this too - when the team has a bad sprint because a member fell ill, or because there were extended disk failures, or some other external event, that goes into the velocity as if it were a recurring occurrence.
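As a toy illustration of that point (every number and distribution below is an assumption, not data from any real team): model each sprint as a handful of 3-point tasks, where the actual hours per point either miss the estimate in a consistent way or swing wildly from task to task, then look at the forecast the measured "velocity" gives you.

```python
import random

random.seed(0)

def sprint_hours_per_point(n_tasks, points_per_task, draw_hours_per_point):
    """One simulated sprint: actual hours spent divided by points completed."""
    total_points = n_tasks * points_per_task
    total_hours = sum(points_per_task * draw_hours_per_point()
                      for _ in range(n_tasks))
    return total_hours / total_points

def forecast(label, draw_hours_per_point, n_sprints=20):
    # The measured rate per sprint, i.e. what velocity tracking averages over.
    rates = [sprint_hours_per_point(8, 3, draw_hours_per_point)
             for _ in range(n_sprints)]
    mean = sum(rates) / len(rates)
    sd = (sum((r - mean) ** 2 for r in rates) / len(rates)) ** 0.5
    # Use the measured rate to forecast a hypothetical 100-point backlog.
    print(f"{label}: {mean:.2f} +/- {sd:.2f} h/point "
          f"-> 100 points ~ {100 * mean:.0f} +/- {100 * sd:.0f} hours")

# Case 1: everyone misses their estimates the same way (~1.5 h per point).
forecast("consistent misses", lambda: random.gauss(1.5, 0.15))

# Case 2: same average, but the error bar per task varies wildly.
forecast("wildly varying misses", lambda: random.uniform(0.25, 2.75))
```

The contrast between the two printed spreads is the point: the same averaging that gives a tight forecast when misses are consistent gives a very loose one when the per-task error swings widely.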
I used them too, with two different teams of 8-10 people, once for two years and the second time for about a year. We didn't lose anything when we gave up on points and simply went to time-based estimates.