Would anyone use and contribute to an open-source "Tortoise Table"-like database?

jcase

Well-Known Member
Platinum Tortoise Club
Joined
Jul 27, 2021
Messages
402
Location (City and/or State)
Pittsboro, NC
Status:

Have fully moved to AWS and Cloudflare. Using two servers now, with the database server being "offline". Database speeds are substantially better, and the cost is 1 cent higher a month.

My biggest concern was appropriately licensed data, but apparently that isn't a concern. I'll be processing data for days.


This week will be slow. This weekend I'm meeting up with @Benjamin to acquire a turtle I've wanted for over 2 decades and, with any luck, completing my turtle bucket list as much as possible (ploughshares not being likely for me). FWIW, Ben might just be the nicest person in the turtle world.
 

jcase

I'm quite shocked at how fast this is coming together. Thanks to some innovative ideas from Steve, Tim, my wife and myself (probably others I've forgotten), we have been able to build out a massive amount of appropriately licensed, relevant data quickly, and using some novel automatic techniques, it is being processed and integrated into the system at a rapid rate.

At the bottom of this is a list of MOST (not all) of the planned features. Since I control all the code and infrastructure, and know it well, it is mostly trivial to add new features. If you have a feature request that does NOT compete with what is offered on TF, please tell me. I do not plan to recreate GOOD & FREE solutions that are already available, but if there is something you want that doesn't already have a good & free solution, please suggest it.

I am really thankful for the quality assistance and support I've been given by fellow TF members. I am open to all questions, comments and suggestions; this entire project has been driven by them.

Release date plans​

Shooting for a launch on either Christmas or New Year's, pending code review and security audits. I only need a day or two of coding to complete the code-related components, but work has me in a crunch, so it will have to wait until I have some time off. I am continuing to process the data we have collected throughout the day, as that process is fairly automated and requires little of my time, just the use of my personal servers.

Ongoing costs:

Not including one-time licensing costs for some data, the yearly cost to operate is currently slated to be around $400. This is a slight increase from earlier estimates, as I noticed some speed and reliability issues with the infrastructure and chose to go with a substantially better model. I plan to c+ises, I may start to offer custom designed care equipment on the site (such as the high accuracy thermostat I've been designing, or another project I have been bouncing around with another TF member) to supplement costs.

Infrastructure and security aspects:​

>Stability and speed​

One of the driving factors for me was that I could not use TTT for a number of weeks; either it was entirely down, or the parts I wanted to use were nonfunctional. In the end I have chosen to run TP on Amazon's cloud, with the ability to spin up new servers, or move TP's servers to other databases quickly. Additionally, all traffic is sent through Cloudflare's CDN/proxy. In the event our site goes down, most of the static pages (what you can see without logging in) will be available seamlessly to the user through their cache.

I have written some fairly innovative solutions with regard to speed. The more the site is used, the faster it will get.

>Security​

As an offensive security engineer, security is very important to me, and I'm generally appalled at the security issues seen on many sites. As a result, I wrote all of the code (with the exception of one PHP library, and a few Python libraries from trusted companies) from scratch using "security best practices". A network security engineer currently contracting with the DOD (a coworker of mine) is going to do an infrastructure security review. A former member of Facebook's red team (offensive hacker that tests their systems daily) is going to perform a code review and security audit. Another security professional who is a TF member has also offered their services, which I will take advantage of if they are still willing when complete.

The web server is entirely firewalled off from the internet with a hardware firewall, accessible only through Cloudflare (users won't be able to tell, except for the improved speeds). All traffic is encrypted; the site cannot be used without modern SSL (HTTPS). Passwords themselves are not stored, but verified by one-way hashing.

The database server containing user information, such as username, email and password, is entirely offline, and only accessible through the web server. It cannot be directly connected to.
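For anyone curious what "not stored, but verified by one-way hashing" looks like in practice, here is a minimal sketch in Python. It is not the site's actual code (which is PHP and has not been shown); the function names, the PBKDF2-HMAC-SHA256 choice, the iteration count and the record format are all assumptions for illustration. Only a random salt and the derived digest are kept, and login re-derives the digest and compares it in constant time.

```python
import hashlib
import hmac
import os

# Assumed parameters for illustration only; the site's real scheme is not stated.
ITERATIONS = 600_000
SALT_BYTES = 16


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return a storable record containing a salt and digest, never the password."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the digest from the attempt and compare it in constant time."""
    algorithm, iterations, salt_hex, digest_hex = stored.split("$")
    if algorithm != "pbkdf2_sha256":
        return False
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

At signup the record from `hash_password` would be what goes into the database; at login `verify_password` is called with the attempt and that record. Because the hash only runs one way, someone who copied the database would still have to guess each password.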

>Infrastructure​

The entire system exists on AWS (Amazon's cloud), and is distributed through Cloudflare. It is written in PHP, and data is stored in a PostgreSQL database hosted offline at AWS. Some of the more complicated behind-the-scenes aspects, such as image optimization, are written in Python. The AI/neural network aspects (species identification via photos) are written in Python and will run on a system made specifically for processing neural networks that I have hosted at my office.

Planned features:​


> Search engines - 90% complete​

-> Browse, basic search, advanced search for searching all of our databases.

>Plant database - 90% complete​

->350,000 species (not counting subspecies, hybrids and varieties)
->Photos of most (maybe almost most? I haven't counted) species (All legally acquired and licensed appropriately)
->Synonyms, common names
-->As a secondary option to diet groupings, toxicity potential ratings: safe, safe as decorations, toxic, deadly (actual rating terminology is to be determined)
--->Range maps are planned for future 2023 release

>User account system - 75% complete​

-> Only required if you want your contributions to be attributed to you and not be anonymous.
-> Required for trusted users

>User editing and comments - 50% complete​

->Anonymous and entry level registered user contributions will all be reviewed by a trusted user
->Trusted user contributions do not need review
->All contributions are logged and can be 'rolled back' in the event of vandalism or inaccurate information
->All content can be flagged/reported, reporting factual issues will require providing citations to back your claims
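To make the review and rollback rules above concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the `Entry` and `Revision` names, the three status strings and the in-memory list are hypothetical, and the real system (PHP with PostgreSQL) may be structured quite differently. The idea is that every contribution is appended to a log, trusted contributions go live immediately, everything else waits for a trusted reviewer, and a rollback withdraws the newest live revision without deleting anything.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    text: str
    author: Optional[str]  # None means an anonymous contribution
    status: str            # "pending", "approved" or "rolled_back" (hypothetical labels)


@dataclass
class Entry:
    revisions: List[Revision] = field(default_factory=list)

    def submit(self, text: str, author: Optional[str] = None, trusted: bool = False) -> Revision:
        # Trusted users skip review; everyone else starts as pending.
        revision = Revision(text, author, "approved" if trusted else "pending")
        self.revisions.append(revision)
        return revision

    def approve(self, revision: Revision, reviewer_trusted: bool) -> None:
        if not reviewer_trusted:
            raise PermissionError("only trusted users can review contributions")
        revision.status = "approved"

    def current(self) -> Optional[str]:
        # The live text is the newest approved revision, if there is one.
        for revision in reversed(self.revisions):
            if revision.status == "approved":
                return revision.text
        return None

    def roll_back(self) -> Optional[Revision]:
        # Withdraw the newest approved revision but keep it in the log.
        for revision in reversed(self.revisions):
            if revision.status == "approved":
                revision.status = "rolled_back"
                return revision
        return None
```

Keeping rolled-back revisions in the log rather than deleting them means vandalism can be undone and still audited afterwards.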

> Source recommendations - ongoing project, never complete

-> Links to trusted sources for purchasing the item you are viewing, including plants, seeds, commercial diets

>Diet groupings/plans - ~1% complete​

->This will include what they should and will eat, and in what proportions if applicable
->This will need community support to complete for all groups, this will not launch complete

> Commercial diet database - 75% complete​

-> listing of known commercial diets and their nutritional analysis

>Comprehensive turtle/tortoise species and information database - 75% complete​

-> photos, range details, general information

>Article/Paper database - 100% complete, but no data at this time​

-> archive of scientific papers, articles and other tidbits. Direct downloads when possible, links when not

>FAQ/Q and A - 100% complete

->database of common questions with short answers, and links to longer answers offsite

>Mobile app - Planned for 2023 release​

-> a mobile app is planned, hopefully with integration into common point and click plant identification apps

>Neural network (artificial intelligence) based species identification system - 25% complete​

- Planned for 2023 release​

This is a project I was working on before that I will be combining with TF. Using photos, it attempts to identify the species of turtle/tortoise. If possible it identifies subspecies, and locality. It currently works for picking out radiated tortoises, Burmese star tortoises and various Cuora species from others. Much training of the AI model is needed (via pictures) to complete. This is a BACK BURNER project.
 

jcase

Slowly coming along. We have photos for about 1/3rd of the database. I'm off until next year starting Monday, and will go full speed then.

Initial launch will be a private beta, to work out the kinks and to allow some pros to help us adjust what needs to be adjusted.
 

Attachments

  • Screenshot_20221212-190540.png

jcase

Recovering from a shoulder injury, so I'm going a bit slower than expected. I do have a new

How about some screenshots?

This isn't final, there will be some changes, I know I have a few CSS issues to work out.


Home page:
Screenshot from 2022-12-22 17-04-48.png
signal-2022-12-22-170626.png
signal-2022-12-22-170628.png


Turtle Species / Diet listing search results:

Screenshot from 2022-12-22 17-04-28.png
signal-2022-12-22-170632.png
signal-2022-12-22-170635.png
 