NPI data, the doctors social network

(Update Feb 18 2011: search.npidentify.com has moved to docnpi.com, I have adjusted links accordingly)  I have been working, part time, on a project for nearly two years to dramatically improve the quality and depth of information that is available on the Internet from the NPI database. For those not familiar, NPI or national provider identifier,  is a government issued health provider enumeration system. Anyone who bills Medicare or subscribes medication now has to have an NPI record, which basically means that it is a comprehensive list of individual and organizational healthcare providers in the United States. You can download the entire NPI database as a csv file under FOIA. There are a little over three million records in that download.

Each healthcare provider provides both credentialing and taxonomy data for inclusion in the database. Healthcare provider taxonomy codes are a fancy way of detailing just what type of doctor you are. Because each provider -can- provide such rich data, there is a tremendous amount of un-used information in the database. NPPES does not do very much data checking, so there is a lot of fat finger data too. I have been working on scripts to improve the overall quality of the data as well as accelerate some obvious  datamining applications. I am happy to announce that after several years of development I am ready to beta launch a dramatically improved NPI search service.

Please visit docnpi.com to try it out.

I have recorded several videos that I will attempt to embed here to show you just what it does, but for those of you who prefer to read:

  • The NPPES search engine has a limit of 150 results, docnpi has no limit
  • NPPES does not allow you to search by type of provider or organization, docnpi allows you to search by both type and group of types.
  • NPPES only lists one taxonomy per provider and it is often over-general, docnpi lists all provider taxonomies in each result
  • NPPES pages the results, while all of the results are listed at one page at docnpi (lets you use your browsers ctrl+f function to do quick sub-searches)
  • the results from any search you do is downloadable as json, xml, or excel/csv
  • No search, except an completely empty search, is too general for docnpi. If you can wait for us to process the data, we will do it for you.
  • Each NPI page automatically exposes the “social network” of any provider or organization by listing all other NPI records that share addresses, phone numbers or identifiers
  • Each NPI page displays a google map for the practice and mailing address listed in the NPI record.

I have lots more features on the way, and I know I need to optimize the site. Loading single NPI record takes too long, because I am doing several huge SQL queries across a very full database. Still if you have some patience you can give me some feedback on the site now. Here are the videos that demo specific searches and expose the data richness of the NPI dataset.

Videos

Implications

Most people have no idea how much information is truly available in the NPI database.

The one group of people who will probably immediately find the site helpful is medical billers, or insurance company employees who want to understand the relationships between different providers. They have been frustrated by the NPPES search tool for a long, long time.

But most people have no idea that this kind of information is even there! I cannot tell you how many people have no idea, for instance, that public health offices very often have an NPI record. Just looking at the taxonomy drop down, should be very enlightening. Using this search engine, you can get very specific and detailed information about the relationships between location and healthcare provider density. You can ask questions like “How many foot doctors does Denver have per-capita compared to New York”. Before docnpi.com, one had to download the data yourself and then run your own queries. But the data download is not normalized and it almost impossible to determine who shares an address unless you normalize across address. Even then without database optimizations (I have learned so much more about MySQL optimization on this project…) complex queries could take hours to complete. The site probably will feel “slow” to you because it can take a long time even to analyze the data for a single provider (30-50 seconds) but many of the matching provider data displays would have been impossible before. I hope to do more optimization and other improvements and I would like to have your advice doing so. Please click the red feedback tab and tell me how to improve the site!

Essentially this site will make the NPI data set far more accessible that it has been before. Stuff that is now easy to do, was previously the domain of expensive data toolkits or data mining experts. This data should be usable by everyone… and now it is.

I should be frank, this site has to pay for itself.  I have not decided how to charge or what I should charge for or even if I should charge. I will probably have to think about this once the number of searches starts to take the server to its knees. Once that happens I will have to spend “real money” on a dedicated virtual server/cluster and that will mean the site must be monetized somehow.  I will be probably end up limiting the number of searches that a given user can do until they pay $20 or something like. That will let most people use the site without paying, but when people start to overuse my CPU cycles they can afford to pay a little. But until my server starts to choke, everything is free. Enjoy.

Posted in NPI

8 thoughts on “NPI data, the doctors social network

  1. Hi, I am looking for a way to use physician info like NPI to authenticate physicians for a website access.
    Do you have any suggestions on using your database for this? If so I would like to discuss this with you.
    Thanks
    Kent

  2. The most useful output of the NPI information would be in an excel file format or a csv file. Are you able to provide a conversion to excel, csv, or enable the loading of external data into Excel feature?

  3. Attempting to get a complete nationwide list of all hospitals (5 taxonomy codes). Would love to see an all option in the state box rather than having to do 255 queries to get all of the data (50 states + DC times 5 codes).

  4. We use this site as a requirement for claim processing. The site is increddibly slow just to get to the “search” option. This is without actually inputting an inquiry. Is this just related to traffic? The site at times also displays an unreachable DNS error, not sure what that’s about.

  5. We are working to make the tool more reliable. We provide it as a free service (for now) so I hope you can understand that some downtime is unavoidable.

Comments are closed.