Introducing the SONAR Score

While all of the focus over the last month or so has been on the Phillies’ quest to repeat as World Series champions, I’ve been spending an inordinate amount of time refining a system that I’ve been building for 2 years now to analyze the performance of minor league players. The system has gone through a bunch of changes, and there are aspects of it that I’m still not 100% happy with, but right now it’s at a place that I like, and the remaining changes will come over time as I figure out better ways to improve it. And like all cool systems, it needed a name, so I give you the SONAR Score system, a system that tries to dig beneath the surface (Sonar, get it?) to analyze minor leaguers. So, check below the fold for more.

Statistically Objective Neutralized Analytical Report score. Cheesy? Sure. But I’m not very creative, and this is actually better than some of the other crap I came up with.

This post is going to serve as the outline of the system, what purpose it serves, and how I came up with what I did. I’m hoping that this helps to explain everything about the system, and I’ll make sure to put the content below along with the spreadsheets and statistics related to the system for reference.

Premise

Every year, MLB teams draft 50+ guys in June, which, if you do the math, comes to about 1,500 players. These players join the thousands of guys already in the minors, playing across 10 full season leagues and another 6 short season leagues. These leagues are filled with elite prospects, career minor leaguers, washed-up veterans who were good 5 years ago, and just about everything else in between. As I’m sure you know, not every league is the same, not every level of baseball is the same, and within leagues, not every park is the same. The minor leagues serve as a learning ground for young players, a place to work on your flaws, develop new pitches, hone your skills, and generally prepare yourself for the major leagues. For some guys, it’s a way to earn a paycheck and to delay, for however long, entering the regular world and holding down a regular job. Minor league statistics, as you may have heard, are not always the best indicator of the type of player a prospect is going to become. Some guys put up seemingly pedestrian numbers for 2 or 3 seasons, then bam, they explode and become elite prospects. Some guys enter pro ball with a boatload of tools and hype, and then 3 or 4 years later people say “he was a 1st round pick? Really?” Some guys spend years in the minors, logging 4,000 AB’s or more before they get a shot in the big leagues. Some guys spend only a few months in the minors. My goal when I started out was to be able to look at a minor league player’s numbers and immediately understand how he stacks up based on what league he is in, how old he is, and what home park he plays in, but to do so by focusing mainly on his core abilities, not his batting average, home runs, RBI’s, or stolen bases.

What’s under the hood

For different reasons, I’m not going to spell out my formulas right now, but I’m going to give some bullet points on what the system looks at and why I chose the factors that I did.

* The final product, whether it be for a pitcher or a hitter, is the SONAR score number. The higher the score, the better the prospect. I’ll give the scale at the end of this entry. The number is not a counting stat like HR or RBI, and it’s not a true average stat like OB%. It’s kind of a composite stat.

* All players have their numbers adjusted based on a few key factors. Those are: age, level, home park, and number of plate appearances or innings pitched. Age and level should be self-explanatory. Older prospects in lower leagues need to have their numbers adjusted downward, just like really young prospects in more advanced leagues need to have their numbers adjusted upward. Adjusting for park is slightly tougher, because park factors in general, especially minor league park factors, are very sketchy. A three year weighted average was used here, which is more reliable than a 1 year number, but it’s still not perfect. If you asked a scout which minor league parks were hitters’ parks and pitchers’ parks, I’m sure he could tell you, but measuring it is difficult. Even MLB level park factors sometimes seem kind of curious, but I knew that I needed to have some kind of adjustment here, or you’d end up with tons of guys from Lancaster and High Desert at the top of the list every year. The other adjustment, for PA’s and IP, might seem less obvious, but it’s one that I think is somewhat important. Baseball is a game of sample sizes, and if you look at a set of statistics, you need to evaluate how reliable the sample is. A sample size of 20 AB’s is worth a whole lot less to me than a sample of 500 AB’s. So players with low PA/IP totals see their score adjusted downward, simply because their sample is less reliable and subject to more uncertainty. It’s a subtle thing, but something I wanted to add in.
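To make the park and playing time adjustments above a bit more concrete, here is a minimal sketch in Python. It is not the SONAR formula, which is not published in this post. The 3/2/1 year weights, the 500 PA "full sample" cutoff, the 15 point maximum penalty, and both function names are assumptions made up purely for illustration.

```python
def weighted_park_factor(yearly_factors, weights=(3, 2, 1)):
    """Weighted average of park factors, most recent season first.

    The 3/2/1 weights are a hypothetical choice, not the SONAR weights.
    """
    pairs = list(zip(yearly_factors, weights))
    total_weight = sum(weight for _, weight in pairs)
    return sum(factor * weight for factor, weight in pairs) / total_weight


def sample_size_penalty(score, plate_appearances, full_sample=500, max_penalty=15.0):
    """Lower a score when the sample is small.

    Hypothetical rule: no penalty at `full_sample` PA or more, and a
    penalty growing linearly up to `max_penalty` points as PA approaches 0.
    """
    shortfall = max(0.0, 1.0 - plate_appearances / full_sample)
    return score - max_penalty * shortfall


# Example: a hitter-friendly park over three seasons (made-up factors).
park = weighted_park_factor([1.10, 1.04, 0.98])

# Example: the same raw score at 500 PA versus 20 PA.
full_season = sample_size_penalty(40.0, 500)
tiny_sample = sample_size_penalty(40.0, 20)
```

The point of the sketch is only the shape of the idea: recent seasons count more toward the park factor, and a 20 PA sample gets marked down while a 500 PA sample is left alone.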

So why is this new statistic important?

That’s a good question, and it’s one that I’m not sure I know the answer to, but that’s part of the reason I wanted to do something like this. When you look across Major League Baseball, most every superstar in the sport was a big tools guy in the minors. Most of these guys were big scouts’ guys. For elite guys, there isn’t much to it. You could watch a guy like Cole Hamels dominate minor league hitters and know that he was going to make it as a big league pitcher. Lots of guys are like this. You didn’t need a fancy stat to tell you Albert Pujols was going to hit. Or that Joe Mauer could hit, or that Yadier Molina had a strong arm and would throw out his share of baserunners. But at the same time, there are lots of guys who come into pro ball with glowing scouting reports, are ranked near the top of prospect lists because of these reports, and never make it. Why don’t they make it? Why do some guys seemingly come out of nowhere and turn into great big league players? Is there a secret formula? Well, that was kind of my hope, that I’d run all of these numbers through a system, and it would identify guys that maybe the big publications weren’t focused on, and when those guys excelled, I could say “hey, I saw that coming” or something like that. When I started putting this idea together a few years ago, my preliminary system loved Brett Anderson, the young lefty for the A’s. Scouting reports on him weren’t great, and most of the publications said he might be a #4, maybe a #3 if things bounced right. He scored really well in one of my early test runs, and was a guy I kind of stowed away in the memory bank. He had a big season last year in the minors, and was amazing at times this year, flashing plus stuff in the majors while basically learning on the job. Could he still only be a #4? Sure, but those same publications that were lukewarm on him a few years ago are now raving about him and talking him up like a future #1. Those are the types of situations I was hoping to unearth: guys who maybe are flying under the radar now, but who are displaying the skills needed to succeed at the next level.

Which brings me to something that I also need to address. This is a system without bias. It’s strictly numbers based, with the adjustments I mentioned above. This system will not favor Phillies prospects, or prospects of any team. This system also does not replace a scouting report. A guy could be dominating Low A hitters while only throwing 88-89 mph because he has a ridiculous curveball that inexperienced hitters have no shot at hitting. That’s where it’s nice to know that Player X only throws 88, because generally righties that only throw 88 tend to struggle. But rather than casting this player aside because of his velocity, you take note that he has dominated at a lower level, and that he’ll need to prove it at a higher level. Should this player find a way to add 2 or 3 mph to his fastball, his core peripherals indicate he already has the ability to get hitters out, and his prospect status could rise. My system is limited, obviously, because the system does not know whether Pitcher X throws hard, or whether Batter Y has any physical projection left. It also doesn’t know whether a guy was a 1st round pick or a 50th round pick, but that’s kind of the point. My system is 100% objective; it’s up to the reader to determine whether a player’s score is too high or too low based on his scouting report.

There will be guys who post really high scores and then do nothing next year. There will be guys with negative scores (more on the scale later, I promise) this year who have big seasons next year. It’s going to happen. Every publication that ranks prospects hits and misses on tons of guys every year. I’m not going to tell you that my info is more important than the info you can get from Baseball America, Baseball Prospectus, or any other site. I’m simply telling you that what I have is a snapshot of how a player has performed, and it considers a lot of different factors. Will it turn out to be useful? I suppose only time will tell.

The SONAR Score Scale

When looking at the list of players, I wanted to try and think of a good way to present the scores and give them some context. Most people who follow the minors know what the 20-80 scale is. When looking at the 5 tools for a player (hit for power, hit for average, running, throwing arm, fielding), scouts will give a player a rating on each tool from 20 to 80 in increments of 5. 80 is considered elite, absolutely top shelf, very very rare. Think Ryan Howard’s power, Carl Crawford’s speed, and Joe Mauer’s ability to hit for average. A 20, on the other hand, is the absolute bottom of the scale. Think Eric Bruntlett’s ability to hit for average, Pedro Feliz’s speed, and Johnny Damon’s throwing arm. Then what you have is everything in between. A 50 is basically considered major league average. Here is a general breakdown of the 20-80 scale for practical purposes:

80 – Elite, very rare skill
75 – Well above average, borderline elite
70 – Well above average
65 – Above average, borderline well above average
60 – Above average
55 – Average to above average
50 – Average
45 – Average to below average
40 – Below average
35 – Below average to well below average
30 – Well below average
25 – Well below average to very poor
20 – Very poor

This gives you 13 different classifications. I think it’s important to create a bell curve type distribution here, simply because it seems logical. For 2009, I considered well over 2,000 position players. Of this sample, only 9 came in with a score of 70 or higher. This makes some sense to me. Sure, there are tons of elite guys in the majors, but not all of them were elite prospects in the minors. You can look at the Phillies’ current major league team as a prime example. Only Cole Hamels was touted as an elite uber prospect in the minors. There were lots of question marks on Howard, Utley, Rollins and others. If you extend out further, there are a total of 194 players who scored in the 50-80 range. That seems pretty accurate, doesn’t it? That’s 194 players in the minors who put up average to elite numbers. This is the sample of guys who I think should be closely examined. The single biggest chunk scored in the 40 range, below average, and intuitively, that seems right. Most prospects don’t make it, and if they do make it, it’s usually as a backup/roster filler type. There will always be guys like Jayson Werth who fall completely off the map, only to re-emerge and turn into stars. But those guys are exceptions to the overall picture.

Using the 13 different categories and the 20-80 template, here is the scale I used when grouping prospects, based on their score. The number in the brackets at the end coincides with the 20-80 scale breakdown above. The hitting scale and the pitching scale are the same. A pitcher with a score of 85 could be compared to a shortstop with a score of 85. It’s up to you to determine whether you want to compare hitters to pitchers. I generally prefer to look at them separately, but it’s a judgment call.

SONAR Score Chart

Score of 110+ = Elite [80]
Score of 95-110 = Well Above Average to Elite [75]
Score of 80-95 = Well Above Average [70]
Score of 65-80 = Above Average to Well Above Average [65]
Score of 50-65 = Above Average [60]
Score of 35-50 = Average to Above Average [55]
Score of 20-35 = Average [50]
Score of 5-20 = Average to Below Average [45]
Score of 5 to -10 = Below Average [40]
Score of -10 to -25 = Below Average to Well Below Average [35]
Score of -25 to -40 = Well Below Average [30]
Score of -40 to -55 = Well Below Average to Terrible [25]
Score lower than -55 = Terrible [20]

I know it should seem obvious, but a score of, say, 65.50 would count in the “65-80” range, while a score of 64.50 would count in the “50-65” range. At that point, you’re splitting hairs, and you could put the guy in either group. But I just wanted to make it easy to read. So a player with a score of, say, 70.34 would be considered Above Average to Well Above Average, obviously a very good prospect. Make sense?
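For anyone who would rather look a score up than read the chart, the grouping can be sketched in a few lines of Python. The cutoffs and labels come straight from the chart; the one assumption is that a score landing exactly on a boundary (say, exactly 65) goes to the higher group, which the chart does not spell out but which matches “110+” and “lower than -55”. The function name `sonar_tier` is made up for this example.

```python
from bisect import bisect_right

# Lower bound of every group except the lowest, taken from the chart.
THRESHOLDS = [-55, -40, -25, -10, 5, 20, 35, 50, 65, 80, 95, 110]

# (20-80 grade, label) for each group, lowest to highest.
TIERS = [
    (20, "Terrible"),
    (25, "Well Below Average to Terrible"),
    (30, "Well Below Average"),
    (35, "Below Average to Well Below Average"),
    (40, "Below Average"),
    (45, "Average to Below Average"),
    (50, "Average"),
    (55, "Average to Above Average"),
    (60, "Above Average"),
    (65, "Above Average to Well Above Average"),
    (70, "Well Above Average"),
    (75, "Well Above Average to Elite"),
    (80, "Elite"),
]


def sonar_tier(score):
    """Return (20-80 grade, label) for a SONAR score.

    Assumption: a score exactly on a cutoff belongs to the higher group.
    """
    return TIERS[bisect_right(THRESHOLDS, score)]


print(sonar_tier(70.34))  # (65, 'Above Average to Well Above Average')
print(sonar_tier(64.50))  # (60, 'Above Average')
```

The same lookup works for hitters and pitchers, since the scale is shared.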

It may look confusing now. But I think after you read this entire entry a few times (if you choose to), then it will make sense.

Final Comments

This whole concept, this whole stat, seems simple to me because I’ve spent hundreds of hours working on the project and it’s second nature. If you just skimmed the whole thing (which I don’t recommend) and you’re just looking for a takeaway, here it is. The score you’re going to find is a composite number, neither a simple counting stat nor a simple percentage stat, that attempts to evaluate what a player has done in the current season based on his core skills. The stat is age, league, park and playing time adjusted. The scale is the same for hitters and pitchers.

As with anything, I’m open to questions/feedback if you take the time to read through this and make sense of it. If you just tell me how dumb it is, or how pointless it is, there isn’t a whole lot for me to say, so I probably won’t. If you have a question and I have an answer to it, I’ll definitely post it. Eventually, all of the scores will be available for all 30 teams, and I’m going to attempt to do a Top 15 prospects list for every team this year, heavily influenced by the scores, just to see how that list fares against the lists put out by BP and BA.

The score is meant as a frame of reference. It represents what the player did in the current season. It’s important, when looking at the player’s score, to also look at other factors which might confirm the high or low score. For this, I think it’s important to focus specifically on a hitter’s walk rate, his Secondary Average, and his strikeout rate. For pitchers, it’s important to see how much contact he allows, and how many home runs he’s allowed, especially when considering his park. A hitter who posted a high score could have arrived at that score because he showed exceptional power, but if he’s sporting a 35% K rate and a very low walk rate, that should raise obvious red flags. Likewise, if a pitcher posts a high score but allowed a lot of hits and a moderately high HR rate, that should raise a red flag. I think the pitching scores are going to be slightly more reliable, because it is easier to isolate a pitcher’s true core skills, whereas hitters require a lot more aspects to be examined, and some of those skills are tougher to evaluate.
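If you want to check those hitter peripherals yourself, here is a minimal sketch in Python. Isolated Power and Secondary Average use the standard public definitions: ISO is (TB - H) / AB, and SecA is (TB - H + BB + SB - CS) / AB. Using plate appearances as the denominator for walk and strikeout rates is an assumption, as are the 30% and 6% red flag cutoffs, the names `BattingLine`, `peripherals` and `red_flags`, and the made-up stat line. None of this is the SONAR formula.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    pa: int
    ab: int
    hits: int
    total_bases: int
    walks: int
    strikeouts: int
    stolen_bases: int
    caught_stealing: int


def peripherals(line):
    """Compute walk rate, strikeout rate, ISO and Secondary Average."""
    extra_bases = line.total_bases - line.hits
    return {
        "bb_rate": line.walks / line.pa,        # assumed denominator: PA
        "k_rate": line.strikeouts / line.pa,    # assumed denominator: PA
        "iso": extra_bases / line.ab,
        "sec_avg": (extra_bases + line.walks
                    + line.stolen_bases - line.caught_stealing) / line.ab,
    }


def red_flags(stats, max_k_rate=0.30, min_bb_rate=0.06):
    """List warning signs behind a score; the cutoffs are hypothetical."""
    flags = []
    if stats["k_rate"] > max_k_rate:
        flags.append("high strikeout rate")
    if stats["bb_rate"] < min_bb_rate:
        flags.append("low walk rate")
    return flags


# Made-up power hitter: 35% K rate, 4% BB rate.
slugger = BattingLine(pa=500, ab=470, hits=120, total_bases=215,
                      walks=20, strikeouts=175, stolen_bases=10,
                      caught_stealing=4)
print(red_flags(peripherals(slugger)))
```

A hitter like this one could still post a good score on power alone, which is exactly when the peripherals are worth a second look.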

With all of that said, it’s time to get to the Phillies prospects. All of the scores and information are collected via Excel, and in the next few weeks I’m going to be adding all 30 teams’ spreadsheets into Google Documents and then adding a page at the top of the site to access all of this information, for those who find it interesting. I’ve stripped the nuts and bolts used to calculate the player’s score out of the spreadsheet and just included a few basic statistics, as well as the player’s score, the heart of this whole process. Players are sorted by score. I’m going to address the players in groups, starting with the hitters. If the pictures aren’t properly visible for you on your monitor, I recommend right clicking and then opening the image in a separate tab or window. A smaller image would have been too grainy, and this is simply an easier way. It won’t look right on the main page, but opening the image separately will make it much clearer and easier to read. I’m not going to touch on every prospect individually, but when I do my Top 30 prospects list, I will be referencing this material, so I’ll touch on lots of guys.

[Image: scorespositive, Phillies hitters with positive scores]

Here you have a list of the Phillies hitters who posted a positive score. Domingo Santana, who I raved about a few months ago, had a very impressive season. It’s easy to get carried away over rookie ball numbers (D’Arby Myers, just a few years ago, burned me here), but unlike Myers’, Santana’s season looks much more legit. His 10.8% BB rate was very good and helps to ease some concerns over the 31.7% K rate. His .364 Secondary Average is very good, as is the .220 Isolated Power. It’s important to remember that he was just 16 for the first 6 weeks or so of his season. He’s obviously going to have to cut down on the strikeouts as he climbs the ladder, but at 6’5/200 lbs already, he could grow into an absolute monster in the Michael Taylor mold. He’s so young, but has shown massive skills already. He’s a long way out, but I’m really excited here. In a similar light, Sebastian Valle posted a solid score of 53.92, the second highest score among Phillies hitters. He got off to a slow start at Lakewood, but even considering those numbers, he basically destroyed the New York Penn League, a very tough league for hitters. Most high school guys in the NYPL are 19 or 20, while most college guys are 21, 22 or 23, so Valle was still one of the younger guys in the league. His overall rates (BB, K, and SecAvg) are brought down by his performance at Lakewood, but he put up an .866 OPS in the NYPL. Domonic Brown and Michael Taylor were compared all season, and not surprisingly, they scored right in the same neighborhood here, with Brown slightly ahead of Taylor. They posted very similar rate statistics, Brown drawing slightly more walks and Taylor striking out slightly less. Both posted excellent SecA’s, with Taylor having close to a 20 point advantage. Taylor has shown a bit more power, but is also 2 years older than Brown. Both guys are very good prospects at this point, with Brown having a bit more projection.

Of the guys who we maybe focused on a bit less, Leandro Castro and Jonathan Singleton stood out here. Castro’s numbers were solid considering his lack of a real track record in the US. His walk rate doesn’t inspire a lot of confidence, and his .255 SecA kind of backs that up. Singleton, on the other hand, shows a lot more positive signs. He posted a stellar 15% BB rate and backed that up with an 11% K rate, a very good indicator. His .340 SecA ranks him just below the big 3 of Santana, Brown and Taylor, and though he didn’t hit for a TON of power in the GCL, he has a good projectable frame, and the power should come. Maybe somewhat surprisingly, Tim Kennelly ranked above highly touted Travis D’Arnaud. Kennelly posted an .802 OPS across 2 levels, while showing good discipline and contact skills. His .268 SecA also ranks above D’Arnaud’s .257 mark. D’Arnaud, as we know, got off to a really slow start and started to come on strong in the 2nd half, but splits in the minors are tough to really gauge. Matt Rizzotti was 2 years too old for Clearwater, but nevertheless had a nice season. He showed decent plate discipline, but a sub-.200 ISO probably isn’t going to cut it for a corner infielder. Then again, I could envision him being something like a Greg Dobbs type of player, which I suppose wouldn’t be too awful as a reserve type player. But that’s a ways away.

Here is the next group of hitters

[Image: negscores1, first group of Phillies hitters with negative scores]

The beginning of the negative scores. Again, a score of zero means the player was basically average for his league, but this takes into account the entire player universe, not just the prospect universe, so an average minor league hitter (a score of zero) is probably below the threshold of an average hitting prospect. D’Arby Myers’ score actually comes in right under the zero line, but his peripherals are not good, especially the .200 SecA. You’d expect him to be higher here, especially with the speed that was advertised when he signed. His plate discipline still has not improved. Cody Overbeck, who I wasn’t high on at this time last year, had a very disappointing season in Clearwater. A .169 ISO from a corner infielder doesn’t inspire confidence, but he could end up maybe being a half decent utility infielder if he could show competence at 2B. Troy Hanzawa put up one of the emptiest .267 batting averages you’ll see, but he’s known for his glove anyway. Quintin Berry shows up here, and the story hasn’t really changed. He shows good plate discipline, but he has very little power, and even his speed wasn’t able to give him anything more than a .258 SecA. Alan Schoenberger, who I don’t think I’ve ever even mentioned on this site, put up mildly interesting numbers as a 20 year old 2B/3B, including a 12% BB rate and a .270 SecA. Another Australian product, he probably isn’t anything to get excited about, as he’s struggled to even hit .200 as a pro, but a walk rate above 10% always stands out.

The final group of hitters

[Image: negscores2, final group of Phillies hitters with negative scores]

Hey, there’s Anthony Hewitt! Not surprisingly, it was a rough season for Hewitt, who sported a 31.2% K rate, essentially the same as Domingo Santana’s. Unlike Santana, Hewitt showed little to be excited about, with only a 3.6% BB rate and a .227 SecA. The .172 ISO is about the only thing to hang your hat on, but simply put, he’s been brutal at the plate. With the latest news of Hewitt’s move to the OF (which I alluded to a few weeks ago), the offensive bar is going to be raised unless he can stick in CF. A corner outfielder will absolutely be expected to outhit a 3B in most cases. Anthony Gose, arguably the fastest player in the minors, ends up well down the list here, and it’s not that surprising. His walk rate is pedestrian, and he’s shown almost no power. Scouts think the power is coming, and if it does, he’ll certainly jump up the rankings. His .273 SecA is driven by his massive stolen base totals at a relatively high success rate. 2009 draft picks Aaron Altherr and Kyrell Hudson are long on tools and short on polish/performance at this point, but again that’s no surprise, especially in such a small sample. Travis Mattair, a consistent source of frustration for me, also fares poorly in the system. His 10.4% BB rate was driven by his first half, as he drew 39 BB in his first 3 months, but only 15 over his last 2+ months. The power is still non-existent, despite a strong physical frame. He has no speed to speak of, so his bat is going to have to carry him. He plays solid defense, so he’ll keep getting chances, but the bat really has to emerge at some point, hopefully some time soon. Freddy Galvis, like Mattair, is a solid defender, more so even, as his defense has been labeled elite already at a very young age. With the bat, however, he appears hopeless. This could be because he tried to learn how to switch hit, or it could be that he just can’t pick up breaking balls, or any other reason. Right now, he may be Ozzie Smith with the glove, but he has no power and he doesn’t draw any walks, so it’s tough to project him as anything other than a reserve infielder. When John McDonald is your current best case projection, it’s tough to get too excited. Maybe the most surprising guy in the entire system is Zach Collier, who ends up posting the lowest score of any Phillie. 2009 was essentially a disaster for Collier, who flopped at Lakewood and then performed poorly at Williamsport. He was only 18, very young for Lakewood, and was obviously in over his head. His peripherals were poor across the board at both levels, and it’s time to throw on the brakes in a big way. I can’t see him going anywhere other than Williamsport in 2010, and hopefully he gets back on track.

This concludes the introduction and the hitter portion of my system for the Phillies. Soak it all in, and ask questions if there is something you are curious about. I’ll roll out the pitching scores on Monday. Please forgive any typos/terrible grammar; it’s a long post and I don’t have time to triple check it.

66 thoughts on “Introducing the SONAR Score”

  1. I have been looking at developing a metric that will allow me to rate a prospect effectively and without prejudice. I’ve been unable to define one that works well. There’s always an X factor that blows it up. I applaud your effort. I also like your sensibility. You know that a guy can look horrendous for a couple of years and then suddenly it all clicks. Maybe in the post-steroid era, we’ll see less of this but I’ll bet we’ll see more of it. Natural talent will rise to the top, even if they’re a basket case in their early years.

    You also know there will always be a place for a true scout. First-hand reports can point things out that no statistic can derive. It might be a knowledge of the game that will allow the prospect to over-achieve or an arrogance that causes a guy to under-achieve.

    Starting with a number that you can then match against other prospects as a starting point is tremendous. You start with a “SONAR” score and then put in the intangibles. It’s a good way to rate the prospects. Of course, gut feeling will also enter my thinking when developing my top 30.

    Thanks for this site and the exceptional effort you put into it.

  2. ps. I noticed the server where your site resides is still on daylight savings time. Or maybe it thinks I’m in Bermuda.

  3. Are you able to explain the red and the green highlights? I’m assuming it means really good/bad, but there are some that would be better than others and aren’t highlighted.

  4. Did you try running the same formula with a player’s entire minor league career and even collegiate career? If so did it make any significant changes?

    Did you try running it with established major leaguers to see how it came out for them?

    It will be interesting to see where the ex-phillies in the Lee trade come out. It seems there is a bit of a consensus with them that might make them a good reference point.

    I must say I’m a little surprised at where Brown and Taylor landed. I would have thought and hoped they would be solidly in that average to above average range.

    But I love the idea. I’m looking forward to seeing how it works out.

  5. Wow! Talk about a labor of love – nice work. It will take me some serious time (which I don’t really have) to digest this properly.

    I guess James feels like he had to outdo Schwimer’s Schwimlocity stat.

    - Jeff

  6. You have way too much time on your hands. Good batting average, good power = good hitter. Low ERA, good strikeout/walk ratio = good pitcher.

    Not rocket science.

  7. James: not finished reading yet, but the SONAR column of each of the graphs is cut off for me. Is there anything you can do to fix that?

    Jake: I would hope you’re being tongue in cheek there.

  8. Whoops, I obviously didn’t read far enough down. I opened the pictures in new tabs and was able to see everything.

    You’d been alluding to this for some time, and I have to say that I’m very impressed with the finished product. Adds a whole new dimension to prospect analysis, and should prove to be a very useful tool going forward.

    Looking forward to seeing the pitchers on Monday.

  9. what are the chances of francisco murillo being back with the organization next season. also will tim kennelly ever be a major leaguer if so how many years away is he

  10. Murillo is a fringe prospect at best. 50-50 on whether he is back. Kennelly needs to improve to be a major leaguer. His offense is good enough to be a backup catcher at this point, though his defense there is suspect. He is a useful backup at this stage, but needs to take a couple more steps forward to earn even a major league utility role.

  11. k so should they protect kennelly if murillo is back next season he will be rule 5 eligable at the end of the 2010 season any chance he could ever get picked up there

  12. when u compare kennelly to ruiz in defense stats kennelly is better in rf/g pb and cs with a 36% not bad i dont think his future is in phillie though

  13. I am the type of learner that needs to see the formula. It seems very well thought out and I would like to see the formula or at least how every other team’s prospects match up. Can’t wait for Monday!

  14. I don’t know if it was taken into consideration, but like in the case of Freddy Galvis, shouldn’t fielding skill be included. Some players, especially catchers, sometimes are more a defensive oriented prospect anyway. Not critiscism (sp?) just a question.

  15. Jake, did you read the post? The whole point is that the most prominent indicators can be misleading outside of their context and that for minor leaguers we need to incorporate as much data as possible in order to come up with a more accurate projection. If you put Eric Bruntlett in low A in a hitter’s park, he’ll put up really good numbers. Eric Bruntlett is not a good hitter. The same can be said for college draftees that enter low in the system and dominate. James attempted to incorporate context and more obscure, but possibly more indicative stats in order to be able to differentiate between a Jason Heyward and a 26 year old college guy playing a bunch of 19 year olds.

  16. I can’t even imagine how long this took to do. The results seem reasonable which is always the first thing to consider. Obviously, your formulas put great credence into age because Santana jumped to the top of the list because his performance was adequate while being so young and Taylor’s performance seemed to be downgraded because he was a bit older. Age is something I’ve struggled with in evaluations. While I agree that a 23 year old at Lakewood should have an advantage, I’m not sure that a 23 year old at AA is any different than a 22 or 21 year old. Personally, I’m fine with 25 year old rookies in this day and age of free agents. I want my rookie ready to play as the years tick away towards free agency. It’s not the old days of a guy coming up as a kid and playing 15 years for one team. One thing I look at closely is when I see a guy get bumped up a level mid season and his performance falls off (such as happened to Kennelly). Some players never regain that edge or whatever they had that made them effective at the lower level. I’m sure it’s hard using your formulas when there’s a mixed season involved between two levels with vastly different results. Best case in point: we all love what Valle is doing this winter and he hit a ton in SS but won’t we all feel better when he hits at Lakewood next year? All in all, this is great work and appears to be a terrific additional source of conversation and evaluation and you are to be applauded. (clap, clap)

  17. Very cool idea, I enjoyed the article and analysis, and look forward to discussing over the offseason.

    I don’t have any idea where you’d go to get the data, nor how long it would take, but have you gone back to 2004,2005,2006, etc, and ran the Phils minor league system with data at those time points? It would be interesting to see how the scores moved over time for someone like Utley, Howard, Bourn, etc as they moved through the system, and as a check to see if scores project into a MLB player…

  18. A few notes

    * Right now, I only have the data for 2009. It’s a representation of what happened this season. All of this data was copied by hand from baseball-reference, because I’m not tech savvy enough to develop a way to mine the data automatically. So I had to go to each minor league affiliate’s main stats page and copy all of the info by hand, then paste it into Excel. In cases where a guy played at multiple levels in one season, I had to use a separate sheet to total his statistics and create one composite line. It’s a very time consuming process. Now that I have the formulas and sheets set up, it’s just a matter of me copying data. I’d like to go back and look at previous seasons, and I might do just that, depending on time.

    But going forward, I’ll have all of the old data saved, so I’ll be able to eventually expand it and look at a player’s entire career under the system to get a better idea of how players progress.

    * As silly as it might sound, I don’t want to publish my formulas in case this system proves to be really valuable/accurate/useful. The people at Heinz don’t publish their ketchup recipe for anyone to use, and they don’t do it for a reason….

    * As for fielding, it’s tough, and that’s one area where I think you absolutely have to rely on scouting reports. Even the most advanced defensive statistics at the Major League level (think John Dewan’s +/-) are suspect. Freddy Galvis may be the best defender on the planet, but I don’t think his minor league fielding statistics like fielding percentage and range factor would tell you that. It’s also tough to really know how much to value defense when evaluating a prospect. If Galvis is barely a .200 hitter in the majors with no power, can he be a starting shortstop, even if his defense is amazing? On some teams, maybe, but not on a top tier MLB team. Defense and position are something to subjectively consider when ranking prospects; I just chose to not include them in the score.
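    The composite line described in the first note (totaling a player's statistics across levels into one season line) could be sketched roughly as follows. This is only an illustration: the stat columns and all of the numbers are hypothetical, and it says nothing about how the SONAR spreadsheets are actually laid out.

```python
from collections import Counter

# Hypothetical counting-stat lines for one player at two levels.
# The numbers are made up for illustration, not real 2009 data.
stints = [
    {"AB": 250, "H": 70, "BB": 25, "SO": 50, "TB": 110},
    {"AB": 150, "H": 36, "BB": 12, "SO": 40, "TB": 54},
]


def composite_line(stints):
    """Sum counting stats across levels into one season line."""
    total = Counter()
    for stint in stints:
        total.update(stint)  # adds each stat to the running total
    return dict(total)


season = composite_line(stints)
batting_average = season["H"] / season["AB"]
print(season)
print(f"AVG = {batting_average:.3f}")
```

    Rate stats such as batting average are recomputed from the summed totals rather than averaged across levels, so a short stint at one level does not get the same weight as a long one.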

  19. so, if someone has a bad Sonar year, but then comes on the next year, do we get to say they were “flying under the Sonar?” No, wait, “swimming under the Sonar?” And you can say that someone has “sunk to the depths” of the system if their Sonar number is among the lowest…
    Interesting effort, thanks.

  20. Oh, and as for the red and green highlights, there’s nothing scientific about it, I just made some notes of guys who had good/bad numbers in those categories while I was going down the list. I didn’t highlight everyone, just the ones that jumped out at me.

  21. Have you tried exporting your data from Fangraphs minor league leader boards for each league, exported to excel or .csv?

  22. Wow, i like it, objectiveness is always a valuable thing to have, especially when some people, myself included tend to gloss over certain unfavorable aspects of our own favorite players/teams

  23. Dan:

    How can we evaluate your results if we don’t know the formula? You have ranked people below average that some thought were above average and vice versa. Yet I don’t know how you came to that conclusion. With ketchup at least I can taste it.

  24. I believe the formula has to be patented. If it is published at all then there is absolutely no way the PTO will allow it to be patented, and then anyone could use it without compensating you. What is considered publishing is a shaky question, which would probably include posting it on this website and could even include telling it to a friend without making him sign an agreement not to disclose it. I’m not an expert but I would not disclose the formula at all until you seek legal advice.

  25. And I think I did a good enough job with my qualifiers in the intro. I’m not saying this is going to be a hugely valuable tool, or that you should trust it any more than anything else. It’s just a data point, and something else to consider.

  26. Slayden already retired, didn’t he? Surprised to see Kennelly, Rizzotti, Murphy, Susdorf so high, but I guess that is a consequence of looking just at hitting, not position and defensive ability.

  27. I seriously tip my cap to you. You did an amazing job on this. The fact that it basically passes the sanity test (the top prospects are still at the top) is good enough for me for now. I agree on not publishing the formula. Why waste all your hard work like that?

  28. Lol it’s a little bit ironic that you won’t post the formula and “waste all of your hard work,” all the while spending an incredible amount of time and money running a website that you won’t put ads on.

    That being said, it’s pretty damn cool, nice job.

  29. I’m not sure I care all that much about the formulas. I throw around terms like OPS+, ERA+, etc., and I don’t know those formulas. I understand the methodology and what they mean, but I can’t reproduce them. I understand this is a weak example, but I don’t see the need to ‘know’ the formulas; we can still discuss, review, and critique without them.

    If you are looking for more ideas on posts about your system, I’d like to see how sensitive the results are to different factors. For example, what would happen to Santana, Valle, Brown if they all were aged 2 years.

  30. Looking at the chart, I’d say it indicates we are short of talent certainly at the upper levels. Brown and Taylor were supposedly two biggies. If they are just average, we have a long wait ahead of us in Philadelphia.

    Better keep what we have. lol

  31. How does the system work with historic data? Like Michael Bourn in A ball or Ryan Howard’s first season at AA. Does it project them to be major leaguers? I ask about these two players because they are extremes. How did SONAR look at the prospects we traded for Cliff Lee?

    Just a thought

  32. I was thinking the same thing as Joey. Sorry James, but I’m hoping SONAR is way off!

    Actually, what might be interesting, is to calculate the SONAR scores for Jimmy Rollins, Chase Utley, etc. from the minor leagues and see what type of scores they would have received. If you get high scores, then you might really be on to something here.

  33. Wow! Thanks for putting this all together and posting it. A fantastic, interesting read. Look forward to seeing more of it.

  34. Wow, you’re nuts. I don’t think i would have the patience to do something like this but i hope you can make some $ or hell even get a scouting job offer.

  35. Phuturephillies –
    Am I correct in assuming that your rating does not include baserunning or defensive position? I know that you said it does not include defensive skill, but do you give any bonus points for playing C or SS, rather than 1B or LF?

  36. I’m gonna wait until I see how this works until I like it. until then, I’m not gonna trust it. nothing personal, I just like to know how evaluations work before I use them.

  37. allentown,

    the score itself does not take stolen bases into account, because i couldn’t figure out an easy way to make it work, or a practical way really. but that’s where using the SecA is handy. Because a guy with a low score but a higher SecA could mean he’s getting more out of his other tools, but not quite getting on base enough or hitting for power, or something like that.

    I place less value on stolen bases in the minor leagues for a number of reasons, so it’s not incorporated in, but even if I did have an easy way to account for it, I’d place a smaller weight on it.

  38. Normalized (rather than neutralized)

    Means that you don’t count the hits or the strike outs, but look at percentage hits per at bat or percentage strike outs per at bat, in order to compare two hitters with different number of at bats.
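    A minimal sketch of that kind of per-at-bat normalization, using two hypothetical hitters with made-up totals (this is not the SONAR formula, which has not been published):

```python
def per_at_bat(count, at_bats):
    """Return a counting stat as a rate per at bat (0.0 if no at bats)."""
    return count / at_bats if at_bats else 0.0


# Two hypothetical hitters with very different playing time.
hitter_a = {"AB": 500, "H": 140, "SO": 100}
hitter_b = {"AB": 200, "H": 60, "SO": 30}

for name, line in (("A", hitter_a), ("B", hitter_b)):
    hit_rate = per_at_bat(line["H"], line["AB"])
    so_rate = per_at_bat(line["SO"], line["AB"])
    print(f"Hitter {name}: hits/AB = {hit_rate:.3f}, SO/AB = {so_rate:.3f}")
```

    Hitter A has more hits and more strikeouts in raw totals, but once both are expressed per at bat, hitter B has the higher hit rate and the lower strikeout rate.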

    Very nice work.

    Gose steals bases like crazy, so I see how you would discount them. I think Toronto wanted Gose in the Halladay deal, so he must have some sort of value.

    I am interested in the pitching system as well.

  39. so basically you are saying that brown and taylor will be average to above average players, which means that they are way over hyped. you might be right…or wrong. will be interesting to see how it plays out.

    have you gone back and applied this scoring system to ryan howard, chase utley and jimmy rollins minor league stats? if so, where did they grade out in this system considering you have two league mvp’s and the best second baseman in the game today and one of the best of all time.

  40. When statisticians create a new model, they evaluate how well the model fits the data.

    And PP Fan is hitting upon this.

    It would be very interesting to see how the SONAR did over the last handful of years in predicting an MLB future.

    There will be guys that made it regardless of SONAR scores.

    There will be guys who didn’t make it.

    If I had to bet a dollar, I’d place an initial estimate that SONAR is correct 75% to 80% of the time.

  41. james – i applaud your efforts. i am sure you spent a ton of time analyzing this and tweaking your model. and you are accurately trying to figure out a puzzle that has plagued baseball forever. so kudos to you.

    in version 2.0, i think that you have to figure out how to account for steals. it shouldn’t be that hard as there are stats related to how many bases a guy has and how much that guy scores. so steal rate should significantly impact the ability to score runs. also, it is one of the 5 key tools. and for a guy on the extreme, like gose, it is unfair to not reflect this element of his game. without this skill, he wouldn’t have been drafted so high. guys like michael bourn are valued very highly in mlb for this skill. so it is important and should be something that can be calculated.

    just food for thought. good try for the first round.

  42. There may be artificially inflated stolen base stats in the minors.

    Pitchers don’t hold the runner well and catchers don’t make good throws.

    Further, in instructional leagues, who really gives a hoot if the runner on first steals.

  43. Baseball Prospectus has something called a speed score which is probably appropriate here as well. Take several indicators of speed such as steals, triples, etc.

  44. CouchKing – i agree that all stats in the minors are inflated. however, relatively speaking some guys outperform others. that is the key point. it is all relative.

    when a guy has a scouting report of:
    Power: 56
    Speed: 100
    Contact: 56
    Patience: 35

    it is kinda unfair to just ignore the category of speed. and his 76 stolen bases in 131 games is great in any league. the guy has some wheels. relative to any level he is in.

    as a bench mark – michael bourn had 58 stolen bases in A ball (at 21) and he just led the national league with 61.

    as another bench mark – jacoby ellsbury is graded as a 95 in speed and had a peak of 33 stolen bases in the minors and had 71 this year.

  45. I think it’s important to also remember that a player’s speed doesn’t mean a whole lot if he

    a.) can’t get on base
    b.) does not have good base stealing instincts

    The second can be taught, to a degree, as we’ve seen with Davey Lopes helping current big leaguers. The first one is a lot tougher, and some would argue that it can’t be taught. If Gose never develops with the bat, he’s a 4th/5th OF, kind of like Joey Gathright. If he shows better hitting skills next year, he’ll certainly rise up the rankings.

  46. You can’t steal first base.

    PP Fan: for another example, check out Joey Gathright’s Baseball Cube rankings…

    Power: 72
    Speed: 94
    Contact: 64
    Patience: 56

    And he’s posted a career major league 68 OPS+. It doesn’t matter how fast you are if you can’t get on base, whether by walking, being an extreme contact hitter, or some combination of the two.

    Plus, steals are accounted for indirectly in secondary average.
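    For reference, here is a small sketch of secondary average using the commonly cited Bill James definition, SecA = (BB + (TB - H) + (SB - CS)) / AB. The stat line is hypothetical, and whether the SecA column in the spreadsheets is computed exactly this way is an assumption.

```python
def secondary_average(ab, h, tb, bb, sb, cs):
    """Secondary average: walks, extra bases, and net steals per at bat."""
    return (bb + (tb - h) + (sb - cs)) / ab


# Hypothetical speed-first hitter (made-up numbers).
with_steals = secondary_average(ab=500, h=130, tb=170, bb=30, sb=60, cs=15)

# Same line with no stolen base attempts at all.
without_steals = secondary_average(ab=500, h=130, tb=170, bb=30, sb=0, cs=0)

print(f"SecA with steals:    {with_steals:.3f}")
print(f"SecA without steals: {without_steals:.3f}")
```

    In this made-up case the net steals account for the whole gap between the two figures, which is the sense in which steals show up indirectly in SecA.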

  47. I think people are prejudging Gose a little too soon. I know that no one is saying he WON’T be good, but it seems like a lot of people already think his bat won’t play. Remember he was 18 and managed a .270+ average in Low-A ball. That’s pretty impressive.

    If he can refine his batting eye a little, he will be a threat.

    He should get with Ricky.

  48. obviously i know that you can’t steal a base unless you are on base first. that comment kind of dismisses the point. as does pointing out other fast players that didn’t pan out. of course there are guys who don’t pan out who have speed (although Gathright has a .263 career mlb batting average, which isn’t horrible). my point wasn’t just because he has speed he will be an all star. the point is, that speed is one of 5 tools. ignoring it (or under-weighting it) in a prospect grading system makes the system flawed. now i am sure that this will piss people off. but all systems need refinement. i congratulated james on the attempt, but i don’t think that he has the final answer yet. it was very reasonable constructive criticism. i even offered ideas on how to measure the impact.

  49. I still can’t figure out a way to view these Sonar scores – any idea? I tried your suggestion, but it didn’t work – I have no ability to see the scores.

  50. You need to find out how to open the image. If you’re on a windows based system, the right click then “view image” option normally works. Not sure how to do it in a Mac. Sorry.

  51. As for valuing speed, this is my more detailed take.

    Speed is partially accounted for in slugging percentage. When people think slugging percentage, they immediately think home runs, but triples count more than doubles, and you’ve got to have big time speed to rack up triples.

    The actual stolen base itself is really tough for me to value. If you’re stealing at less than 70%, you’re hurting your team. To make a positive impact, you have to realistically steal at an 80% clip.
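    A rough sketch of the break-even idea behind those percentages. The run values below are illustrative assumptions (a steal worth about 0.2 runs, a caught stealing costing about 0.45), not figures from this post or from the SONAR formulas, and the two runners are hypothetical.

```python
# Assumed run values, for illustration only.
SB_RUNS = 0.20  # runs gained per successful steal (assumption)
CS_RUNS = 0.45  # runs lost per caught stealing (assumption)


def break_even_rate(sb_runs=SB_RUNS, cs_runs=CS_RUNS):
    """Success rate at which steal attempts add zero net runs."""
    return cs_runs / (sb_runs + cs_runs)


def net_runs(sb, cs, sb_runs=SB_RUNS, cs_runs=CS_RUNS):
    """Net runs added by a season of steal attempts."""
    return sb * sb_runs - cs * cs_runs


print(f"Break-even success rate: {break_even_rate():.1%}")

# Two hypothetical runners with the same number of steals.
print(f"40 SB, 20 CS (67%): {net_runs(40, 20):+.1f} runs")
print(f"40 SB, 10 CS (80%): {net_runs(40, 10):+.1f} runs")
```

    Under these assumed values the break-even point lands just under 70%, and the same 40 steals go from slightly negative to clearly positive as the success rate climbs to 80%.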

    Which brings me to incorporating the SB into my score. I chose not to because of 2 main reasons.

    1. I don’t trust stolen base totals in the minors. Lots of pitchers, especially RHP, struggle with holding runners. When you read scouting reports on college guys entering pro ball, you often read “has trouble holding runners” or “doesn’t have much of a pickoff move”, and it’s something that takes time and instruction to fix. Also, the catcher is the toughest position in baseball to play. Some teams stick guys who can hit behind the plate, in hopes of them being able to be at least adequate back there. And most of these guys don’t make it to the majors as a catcher. A guy with Gose’s raw speed should be able to steal a ton of bases against pitchers who have no/bad pickoff moves and catchers who are still learning how to play the position. But his stolen base ability in Low A doesn’t really mean a whole lot to me. His speed will certainly be a defensive asset, and there’s a good chance he can swipe 30-40 bases in the majors. But I don’t think it should be something that is weighted much at all when thinking about how good an offensive player he’ll be. Which brings me to

    2. Stolen bases simply are not a hugely valuable asset. Don’t take that statement the wrong way. Stolen bases, when stolen at a high success rate, are valuable. But it’s also valuable to be able to consistently go from 1st to 3rd on a single, and score from 2nd on a base hit. Those are statistics that I couldn’t possibly capture because I don’t have access to data like that at the minor league level, and I don’t know that anyone not working for an MLB team has that info. The cliche “you can’t steal first base” is valid. And it’s something I put a lot of stock into. OB% is the single most important statistic you can look at when evaluating a player’s worth. 27 outs in a game, the goal is to get as many guys on and in before you make 27 outs. A guy who gets on base less than 30% of the time is a liability. Slugging percentage is vitally important too, obviously, because guys with no power who hit an empty .275 are less valuable than a guy who hits .260 but has a .350 OB% and a .500 slugging percentage.

    So to make a long story short, I chose not to include the stolen base because I don’t think it’s that important. There are very few players in baseball who fit the empty .275 batting average, 50-60 SB mold and are considered highly valuable players. Juan Pierre is just about the best case scenario for any slap hitting speed demon, but even in his glory days, Pierre was overrated by the masses. And here’s the proof. In his best statistical season, 2006, he posted a WARP3 of 4.0. WARP3 is Wins Above Replacement Player, adjusted for era of play and league. His 3 best seasons after that were 3.2, 3.0, and 3.0.

    Contrast him with a slow, plodding player who hits for a low average, never steals a base, and hits for power, say Pat Burrell. Burrell’s monster 2002 gave him a WARP3 of 7.6, and his next best readings were 4.6, 4.2, and 3.1. His best season was worth about 3.5 wins more than Pierre’s best season, and he was a very limited offensive player. But his skills (drawing walks, hitting home runs) are much more conducive to scoring runs. Of course you have to factor in defense (though Pierre’s arm is awful, so he’s not really a plus player there either) but you get the point.

    Players who steal tons of bases can be valuable, but the Gathright example fits my point perfectly. He has ridiculous speed, he can jump over cars, but he wasn’t really good at baseball, and he never learned to get on base. His career .267/.327/.303 line really says it all. His walk rate wasn’t horrible, but he had no power at all. His best season by WARP3 was 1.1 on two occasions. All of that speed couldn’t help him ever crack the 350 PA mark in the majors. Teams tried, and they quickly gave up on him every time.

    I’m not trying to say Gose is the next Joey Gathright. I’m just saying that unless he learns to either;

    A.) hit for a high average
    B.) hit for power

    His offensive value will be limited, and that’s what my score is trying to determine.

  52. Phuture…thanks for the explanation. It’s obvious to me (and I think everybody else) that you’ve put a ton of work into something that you thoroughly enjoy. Thanks for sharing with us and putting up with the “knee-jerk” reactions of somebody who just spent 15 minutes reading something it obviously took you much longer to create. I’m sure that you’ve spent a lot of time considering some of the ideas being thrown out and how to handle them before they were even suggested on the board, but it is hard to convey that in this space.

    At any rate, I’m anxious to see how the Phillies teams rack up against some of the other teams out there and get a wider view of the SONAR score across the league. Also looking forward to seeing the pitchers scores. Thanks again for sharing.

  53. A fast enough player who learns to bunt well basically can steal 1B, or force the defense to position itself in a way that permits hitting for a higher average in the absence of significant power. Bunting is a different skill than hitting. Gose should learn it. Really Howard should learn it also.

    Gose has done very well given his age and level in the organization. He should get stronger as he ages and hopefully learns to recognize pitches better. What I see to date is a lot more positive than negative.

  54. If Howard could bunt consistently down the third base line, eventually that would force the end of the shift, as he’d have a base hit every time he got a fair ball past the pitcher. If he can learn to slap grounders to the left side, that would be equally good, although I think harder to accomplish. Howard is losing 30 – 40 points off his BA because of the shift.

  55. if he learns to bunt, the only advantage that would bring is that he would move the 3b who would be playing at ss back towards 3b. there would still be three playing infielders on the left side of the diamond. how often does he hit a grounder to the ss area?

  56. Looking at Freddy Galvis’ number at negative 44, I would be disappointed with him as a prospect.

    Galvis is 19 and was at 2A at the season’s end, where I read that he was doing okay and not overwhelmed.

    So, we have a 19 year old smooth as glass defender at 2A.

    Was his season as a hitter so poor early on that his SONAR score was tanked?

    I guess that these are a full season’s worth of data. It would be nice to see the SONAR by month for a player to see how they are developing during that campaign.

    At the end of the season your SONAR score for August could be much higher than May and your August score reflects your new value to the organization.

    The downside to monthly scores is that everyone knows that baseball is a very streaky game. A player could have a great month or a terrible one.

    You could also graph a player’s monthly SONAR scores to see how they are trending, or whether one month was just an anomaly of some sort.

  57. so, a scouting system that doesn’t take fielding, arm strength, speed, intangibles, or offspeed pitches into account. Sounds like a great idea. :o)
