00:00.00
James
Everybody, thanks for listening to the James from Montana podcast, the new podcast in which I interview experts in the tech industry with the goal of slowly uploading the collective consciousness of tech into the cloud. For more information on today's guest, topic, or how to be a guest yourself, visit jamesfrommonana.com/podcast. Today I'm excited to welcome Jason Dolatshahi, a self-described data leader with over a decade of experience leading AI- and ML-driven data product development, driving cross-functional strategic initiatives, and growing and leading high-performing teams to deliver transformative results through data, technology, and product-led growth. Jason, how are you, man?
00:59.78
Jason
I'm great. That's a mouthful. It's great to be here, James, and thank you for having me.
01:07.25
James
So yeah, how have you been? What have you been up to?
01:13.62
Jason
Yeah, I've been all right. You know, I work in data, obviously, in machine learning, and now has been a pretty productive time in terms of practical outputs from the field. Language models, generative AI, all these things are becoming really common to mainstream experience. It's kind of funny from the perspective of data science to have all of these eyeballs turning in your direction all at once. But I've taken the opportunity to get my hands dirty with a lot of these new tools and try to build new things using libraries like Transformers and some of the add-ons that go with it, and the powerful techniques that are just being unleashed and then also, thankfully, open sourced, you know, so people can get their hands on them and try stuff out. Yeah, I could go on and on. I mean, there was a paper that came out last week about how you don't really need machine learning in the first place and gzip is really the answer to all your questions. It's pretty fascinating. It's not quite that simple, but there's so much going on. It's an effort just to keep up, you know.
02:20.92
James
And I love it. You're a mind in data science, and I think there have been a lot of eyes turning towards it recently, mainly towards generative AI. Could you give us a rundown of what data science is, and what AI and ML, machine learning, are, for those who aren't in that world of data science and ML?
02:48.77
Jason
Well, yes. Let me qualify this by saying that there's no answer that's going to give you the kind of linguistic closure that you're probably after. I think that the language really lags the field, and it's sort of retroactive in trying to describe
03:06.55
Jason
or differentiate between different phases in what's really a long, continuous, kind of integral stream of applied research, going back even to, you know, really broad public projects, like federal projects in the '40s and '50s, the early parts of the twentieth century, doing things like statistics, studying how to use data from physical objects, like maybe a submarine, in order to make decisions. You know, radar systems, et cetera. That's where a lot of what we call ROC analysis, or even Bayesian statistics, comes from: some of the original techniques that people were using, you know, pushing a hundred years ago at this point. Anyway, it used to be called statistics. Then enter things like MacBooks and distributed computing, and now you're talking about data science. Machine learning was always kind of the computer science-y direction, I guess, rooted more in algorithmic theory and thinking about asymptotic behavior and writing papers and stuff. That's where I think the language of machine learning enters the picture, but it's all still kind of the same thing. And then with deep learning and black-boxy model architectures, you get into stuff like AI. If it's not fully transparent, if you can't actually follow the dotted line, et cetera, maybe it's more comfortable to call that something like an intelligence. And then, of course, generative AI is just about these things that generate sensory experiences for people. All this language is very much in the first person. It's always describing it from the sense of the customer of the language, if you like.
04:37.59
Jason
It's generative because it's creating things that I take in as inputs to me, I guess.
04:42.52
James
Yeah, I feel like that's a really good way to explain it all. I have the unique challenge sometimes, I have a five-year-old, of explaining my work to him, and
04:58.35
James
I think it would be really interesting if you took a shot at explaining to my five-year-old exactly what you're good at.
05:06.45
Jason
Yeah, the truth is, it's not so complicated. I mean, I think most of the effectiveness comes from removing complexity rather than adding complexity. Maybe that's arguably true; I think it's really obviously true. Anyway, so what do I do? I use data and computers to make decisions and predictions and to try to reduce the amount of uncertainty in forecasts. Usually this happens through mathematics, statistics, and scientific programming; Python and Unix environments play a big role, along with specialized scientific computing libraries, you know, everything you may imagine that goes with the Python data stack: NumPy, pandas, sometimes SciPy, scikit-learn, PyTorch, now Transformers. Gradient-boosted models like XGBoost have their own set of libraries that go with them. So, in various layers of complexity, all it is is, again, using data to try to reduce uncertainty. Really, if you boil it down, that's what it is. It's being actively empirical.
06:12.58
James
Yeah, makes sense. You mentioned earlier the blurred lines around whether or not a model is fully transparent, and that you would call something that's not transparent more of an intelligence. How do you feel data science, what was previously purely statistics, overlaps with things like artificial intelligence?
06:42.99
Jason
Yeah, I mean, I think the crux of that whole description of the timeline of the language, the thesis of it, is that it's really all the same. It's really not that different from statistics as it was practiced decades before, other than that you have a different kind of model architecture at some point down, if you like, the value chain, or the evolutionary chain, of this data, from where the data is generated to where it's being used to make a decision. At some point, most of the way down the line, you have some implementation details that are different. But the rest, the concepts and the thinking and the intuition, is the same. I mean, even in generative models you frequently have to think in terms of probability distributions, and this always comes up when you're thinking about designing experiments. You're thinking in terms of probability distributions, and most of the intellectual infrastructure for that is hundreds of years old. You know, if you're thinking in terms of normal distributions, which is sometimes true, not usually true, you're pushing 400 years old. For other things maybe 300, but still, in some senses, ancient knowledge that's really practically useful.
08:01.59
James
Forgive my ignorance. When you say distributions, are you talking specifically about what may be predicted, or...
08:04.88
Jason
Oh my gosh, what an interesting subject. No, distributions: imagine you are an e-commerce company and you look at a cross-section of your users going back twelve months, and you say, okay, I want to look at the shape. If you plot, say, every user's spend on our platform for the year, you're going to have some users who spent basically nothing, a large number of users who spent some kind of middle amount, and then a very small number of users who spent a huge amount, right? So you have this really distinctive long-tail distribution. The concept of the distribution is about achieving that statistical shape with mathematics, so that you have parameters that you can calibrate with data, in order to say, okay, if the data goes this way, then the shape changes in this way, and then my predictions about, say, customer behavior in that example are different in this way. The distribution is really the statistical machine, the geometric object, that allows you to do these things. Which is what I mean: that stuff doesn't change. That's the same as it has been in old versions of textbooks.
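The idea of a long-tail shape with parameters calibrated from data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spend data is simulated, and the log-normal family is one common choice for long-tailed quantities, not something named in the conversation.

```python
import math
import random
import statistics

# Hypothetical data: simulated yearly spend per user (not real figures).
random.seed(0)
spend = [random.lognormvariate(4.0, 1.2) for _ in range(10_000)]

# Calibrate the two parameters of a log-normal distribution from the data:
# mu and sigma are the mean and standard deviation of log(spend).
logs = [math.log(s) for s in spend]
mu = statistics.fmean(logs)
sigma = statistics.stdev(logs)


def share_above(threshold: float) -> float:
    """Predicted share of users whose yearly spend exceeds the threshold."""
    z = (math.log(threshold) - mu) / sigma
    return 1.0 - statistics.NormalDist().cdf(z)


median_spend = math.exp(mu)                 # half of users spend less than this
mean_spend = math.exp(mu + sigma ** 2 / 2)  # pulled upward by the long tail

print(f"fitted mu={mu:.2f}, sigma={sigma:.2f}")
print(f"median spend={median_spend:.0f}, mean spend={mean_spend:.0f}")
print(f"predicted share spending over 1000: {share_above(1000):.2%}")
```

If new data shifts `mu` or `sigma`, the fitted shape changes and so does every prediction derived from it, which is the sense in which the distribution acts as the "statistical machine".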
09:17.56
James
That's really interesting. So I guess your example of e-commerce and user spend shoehorns into a question that I wanted to ask you. I want to understand what you think the role of data science is in engineering, specifically inside of an organization.
09:40.64
Jason
The role of data science in engineering: I think that frequently they're complements. I think engineering is about building systems, and building systems at scale that are efficient to operate and to maintain, and I think data science is about, like I said, looking at data and using tools like mathematics and technology to inform decisions. So in a strong sense, you can't do one without the other, and frequently you use tools from engineering in building, you know, machine learning services. Certainly there's a lot of intersection there. One thing that I've always been really interested in, and encouraged people I work with to do, is to learn as much as possible about things like DevOps, infrastructure, and the way software engineering happens in terms of CI/CD, horizontal scaling, Dockerization, containerization. I guess this started because my first data science job was at a very small early-stage company, so it was me and then the DevOps person, and anything I could do to unblock myself was less time that I spent waiting for this guy to not be
10:53.28
Jason
trying to do the work of half a dozen people, you know. So it's really practically important to be able to unblock yourself, and that scales even in larger organizations as a data scientist, because your job is frequently to, again, observe what's going on, empirically speaking, and then understand: what does that mean, what can we do about it, how will that change things, are those changes worth pursuing, et cetera.
11:16.63
James
Yeah, like informed engineering based on data. So, just for the listeners, I've had an opportunity to work with Jason before. I feel like we were doing something
11:18.29
Jason
Um, yeah.
11:30.60
James
somewhat transformative for our engineering team as a result of that time together. Unfortunately, that time was cut a little bit short, but I think in essence we were working to establish a relationship between engineering and data science. I've talked to a lot of people, and my general understanding is that there's typically a rift between engineering teams and data science. I want to know why you think that is.
11:53.16
Jason
Yeah, that's so interesting. I mean, broadly speaking, I think group dynamics are such that, you know, there's this idea, again an old idea, and I think the guy's name was Conway. Conway's law, not the cellular automata guy but another guy, which says that software interfaces are basically implementations of the communication interfaces that exist between groups of people, you know. So if you have two ends of two different services and they're supposed to communicate, basically the API is only as good as your ability to communicate about what the API should look like, how it should be structured, how it should work, and whether we have some sort of, you know, enforcement or whatever. But communication requires communication, I guess. And so software communication is frequently the catalyst, the forcing mechanism, between engineering and data science teams. We have a model, think of it as a binary, and we need to deploy it so that it actually does something in a system. That requires a data scientist to talk to an engineer. I think, just nuts-and-bolts-wise, those kinds of structured communications aren't usually prioritized in organizations, because everybody's already so busy with more meetings than they probably need, and those kinds of touch points just kind of go away, entropically, unless you're proactive about keeping them alive.
13:23.69
James
Yeah, that makes complete sense: just the degradation of communication. What do you think the best way is to bridge that gap? I mean, probably the answer is, you know, actually having conversations. But what on top of that?
13:45.82
Jason
I think that actually having conversations helps. I think it's additionally challenging, honestly, with distributed, in many cases globally distributed, teams. That's just an extra sort of entropic force that you have to work against to perform relationship hygiene, group and team hygiene. I think it's really important. I guess it's the function of the leader to go about that. It's almost like a garbage collection mechanism, to keep the environment, you know, safe and secure for productive, happy work to happen. But I think in addition, I mean, one
14:20.55
Jason
sneaky way that I've seen work, which I really just copied and pasted from being a graduate student, is this idea of journal clubs. Did you ever come to one of these? They're totally optional, usually monthly, cross-functional team meetings, cross-functional depending on who shows up. You know, in the data science team, certainly, this is just what always happens: I'm reading about stuff, I post a bunch of links, and the team votes on which paper we should all read jointly, and then we get together and talk about it. Again, typically once a month, totally opt-in. This is meant to mimic the experience of being a graduate student; it's basically what you do. So anyway, long story short, this gives people an opportunity to rub elbows, even in a distributed remote setting, and say, here's this thing that I did. Maybe that paper is to do with work, or maybe it's about basketball because somebody chose a basketball paper, but, like, I thought this was cool, I thought this was confusing, whatever. It helps to sharpen hard skills, it helps to sharpen soft skills, it helps to build those cross-functional ties. This is just one sort of move, but it's a move that has been popular in my experience.
15:34.65
James
So yeah, I think it's potentially a really good way to cross-train or actually spark interest among engineers. That's pretty neat. What exact resource are you using to scroll through papers?
15:54.89
Jason
I guess the answer to the question, what did I actually get out of graduate school, is the same as, what resource do I use for scrolling through papers. Graduate school for me was about learning how to learn. I was in different fields. They were all pretty quantitative. None of it was about data science or really computers directly, but it was all about using math and using data and coming up with informed conclusions and figuring out, in a really practical sense, how to understand the nature of the problem that you're facing and what to do about it. You know, so the approach that I take is, there's a zillion, there's the firehose of inputs coming at you through LinkedIn or Twitter or various blogs or any kind of, you know, social media or other input, and you have to kind of know how to skim what's useful off the top and understand, once you have a subject that you want to learn about, what are the tools? How do you start researching a problem? How do you move from the information-gathering, setting-the-scene stage to, okay, now I have an organization, sort of, with topics and a hierarchy of ideas, and I want to organize them like this and go deep on that, et cetera? Building a mental model of these subjects: that's basically what you do when you're studying physics and math.
17:23.87
Jason
And it translates really directly. I think this is what I like about data science. I liked it because it was the average, really, between a career in academics and, you know, something strictly commercially focused, where you have the opportunity to be creative and intellectually curious but still apply your skills to practical problems.
17:45.33
James
So your background is mainly in mathematics, statistics, you know, general data science foundations, right?
17:58.54
Jason
Yeah, physics as well. Which is interesting because, well, that's another story. But yeah, physics as well.
18:03.23
James
Okay, yeah. It sparks a question for me. I think a lot of engineers, especially now, are getting more interested in ML and AI, but it does seem like these are two different tracks to begin with. Either you need a background in hard stats, or in your case physics, or you're working directly with computers and learning computer science foundations. What are your thoughts on that?
18:37.23
Jason
I think that most of the people I've met who are really good data scientists have come at it from sort of a quirky angle. I think it's sort of selective in that, you know, it's also a pretty recent field. Branded degree programs that say "data science" are a pretty recent development, so before that it was necessarily interdisciplinary. You typically came at it from some other angle because there was no straight path there. You know, there are obstacles or opportunities regardless of where you start. If you start from pure math, you have to figure out how to go from pencil and paper to a command line. It's a big leap. If you start from pure engineering, you have to figure out, what is the deal, what is a tensor, and some of that can be difficult to digest. But again, it's just an exercise in identifying tools and then using them. And speaking of transformative, and things that keep happening faster and faster: the prolific amount of information that's available on the internet, and now available to be synthesized through a language model interface, is such a huge, huge opportunity for anybody who wants to learn anything about anything. Not to the extent that you believe everything it says is gospel, or that it removes the need to work
20:09.83
Jason
or to work toward understanding, which it doesn't. But if it can get you 80% of the way down the path towards mostly anything that you know how to shape into the right kind of problem to feed it, what could be more powerful? It's a pretty transformative product, I think.
20:25.23
James
So maybe this is a question for a language model, but I do have you here, and I think you're better than a language model. How do you think an engineer could get into ML and AI? What kind of questions are they asking generative AI?
20:43.13
Jason
How do you get into it? Yeah, I mean, again, it depends. If you're an engineer who has a lot of experience, say, from a PhD program, reading CS papers, that gives you a way into it. If you're an engineer who has a lot of experience building and scaling infrastructure and systems, then I think you can start to look at MLOps. You were referring to the stolen opportunity that you and I nearly had to build something that was starting to shape up and be pretty cool. This gives, I think, an additional layer of transparency; it helps to bridge the gap between engineering and data science. So now you start to see a little bit more behind the curtain. Okay, I sort of understand what's going on; at least I know something about what model services need from an engineering perspective. But if you want a really black-and-white line, here's a good version of it. For example, if you know something about Kubernetes, look at the interface that a tool like KServe implements. It basically exposes, I think, a new CRD for Kubernetes that is all defined in YAML, and it describes everything the model service needs to exist and deploy successfully, handle requests and responses, et cetera. If you're used to things that look like that, start there and say, okay, now I know a working definition of a model: it's something that has this behavior, has maybe these attributes,
22:13.20
Jason
and these sorts of, you know, nouns and verbs are associated with it. Now pick your favorite modeling problem, say a language model or something. If you were going to deploy one, and you know that it has to implement that interface, you're kind of backing into the pencil-and-paper stuff incrementally, step by step. You say, okay, in order to generate those outputs to populate that kind of interface, it's doing stuff like making predictions, and then you proceed from there. I mean, I could go on forever, but I think basically you just follow these crumbs. It's, again, the whole experience of being a student of science, if you like, a graduate student. It's mostly about slow progress and being frustrated and being totally confused and not understanding what's going on, and then you have these jumps or steps where it's like, oh, I thought this was 30 different things and it's really one thing, and that enables you to go on to the next step, et cetera.
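For readers who think in Kubernetes manifests, the shape of such a model-service definition can be sketched as follows. This assumes the tool mentioned is KServe and that its `InferenceService` resource uses the `serving.kserve.io/v1beta1` layout shown here; the service name and storage location are hypothetical, and the exact fields should be checked against the KServe documentation before use. The manifest is built as a Python dictionary and printed as JSON, which Kubernetes tooling also accepts in place of YAML.

```python
import json


def inference_service_manifest(name: str, model_format: str, storage_uri: str) -> dict:
    """Build a manifest shaped like a KServe InferenceService (assumed layout)."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "model": {
                    # What kind of model artifact this is.
                    "modelFormat": {"name": model_format},
                    # Where the serialized model (the "binary") lives.
                    "storageUri": storage_uri,
                }
            }
        },
    }


manifest = inference_service_manifest(
    name="demo-classifier",                         # hypothetical service name
    model_format="sklearn",
    storage_uri="s3://example-bucket/models/demo",  # hypothetical location
)
print(json.dumps(manifest, indent=2))
```

Read as a working definition, the manifest says a model is a named thing with a format and a storage location that a predictor can load and serve.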
23:06.32
James
And I love that approach. I think there's a natural curiosity in all of engineering, and starting from where you're at, if it's infrastructure, you know, figuring out how to deploy a model that's already built
23:16.33
Jason
Um, yeah.
23:23.29
James
might be a good step in the right direction, right?
23:24.23
Jason
Yeah, and practically speaking, I don't want to say it's trivial, it's certainly not trivial, but it's so much easier now because pre-trained models are readily available. You could literally, I think, just grab a binary of a model from a model hub like Hugging Face and then figure out a way to wrap it in KServe and say, okay, my service is on, it's giving heartbeats, this thing is officially running. And then you're a lot further down the path than you were, you know.
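The "wrap a model and check that it gives heartbeats" move can be sketched without any serving framework at all. Everything below is hypothetical: `KeywordSentimentModel` is a stand-in for a downloaded pre-trained model, and `ModelService` is an illustrative wrapper, not the API of KServe or of any Hugging Face library.

```python
from typing import Protocol


class Model(Protocol):
    """The 'verbs' a model needs for this sketch: it can predict."""

    def predict(self, text: str) -> str: ...


class KeywordSentimentModel:
    """Stand-in for a downloaded pre-trained model (hypothetical)."""

    def predict(self, text: str) -> str:
        return "positive" if "great" in text.lower() else "neutral"


class ModelService:
    """Minimal wrapper exposing a heartbeat and a request handler."""

    def __init__(self, model: Model) -> None:
        self.model = model
        self.ready = False

    def load(self) -> None:
        # A real service would load model weights here.
        self.ready = True

    def heartbeat(self) -> dict:
        return {"status": "ok" if self.ready else "loading"}

    def handle(self, request: dict) -> dict:
        if not self.ready:
            raise RuntimeError("model not loaded")
        predictions = [self.model.predict(text) for text in request["instances"]]
        return {"predictions": predictions}


service = ModelService(KeywordSentimentModel())
service.load()
print(service.heartbeat())
print(service.handle({"instances": ["This podcast is great", "It is Tuesday"]}))
```

Swapping the stand-in for a real pre-trained model changes only the class passed to `ModelService`; the heartbeat and request handling stay the same.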
23:56.44
James
Yeah, I love that. Hugging Face is a neat place to be. So let's talk a little bit about what is exciting you about machine learning these days.
24:01.74
Jason
Cool product, too. I mean, it's pretty neat.
24:14.32
James
What are some exciting use cases you've seen or are seeing now in the general wild of machine learning?
24:22.82
Jason
Yeah, I mean, the huge elephant, elephant is an understatement, but the huge elephant in the conversation is obviously language models. I think we're pretty close to the top of the hype cycle with respect to language models. They're clearly a big deal. I don't know if they're going to
24:41.45
Jason
replace all of us, you know, or anything related, but for various reasons I think it's not just going to go away. I mean, it's not a fad any more than neural nets are a fad,
24:42.74
James
I.
24:59.60
Jason
or any number of other examples, or even data science itself as a brand name, quote unquote, that had this hazy definition at first, with huge companies changing the name of a huge swath of jobs to "data scientist," and what does that mean about this idea or this language? I think the hype cycle has a familiar shape because we have subjective responses, like, this is exciting, and then at some point, inevitably, you're like, wait a minute, it doesn't do the universe of everything that I thought it was going to do, so now I hate it and I have a really negative opinion. But the truth is somewhere in the middle, and I think we've yet to really find the bottom, or equilibrium. I think it's naturally important to understand how to use these tools and do stuff with them. I mean, Transformers, for example, aren't just useful for language processing but for other applications as well, including, you know, multimodal applications or any number of other applications. I think the reason it's a powerful idea is, again, because it's so generic. It scales across modalities; literally, you can do text to image. That's not a property of any other kind of model. So this is certainly here to stay. What it means, I think, is still TBD, but it's definitely where the action is heading. But then again, there is this incredible paper, I think an incredible paper, from about a week ago. So we were talking before about the difference between machine learning and intelligence and the notion of this black-boxy architecture, which,
26:31.62
Jason
formally speaking, is really about the representation of the inputs that the model creates, rather than making predictions. That representation is, again, more formally speaking, an embedding in a vector space, meaning any kind of input now corresponds to a vector, or a high-dimensional vector, which is a tensor. It lives in some sort of space, a vector space, that has really strong, clearly defined mathematical properties, and when you have something else, like a distance function that works on that metric space, now you can measure distances and do all kinds of things, make mathematical conclusions. So they showed that if you take that same thinking about representations, remove the idea of the model, remove the idea of training, remove the idea of machine learning in general, and just say, how can I form a representation, so something that lives in a vector space, for cheap, basically, they found that by using gzip to do this you can achieve state-of-the-art performance on text classification tasks without any need for training or hundreds of millions of parameters or anything. So what does it mean, practically speaking? Totally open question. But this is really, I think, fascinating, really out of the box, you know. This is the physics, the fan of physics inside of me. It's kind of like seeing unexpected results in a collider. Nobody was looking for gzip to compete with
28:03.16
Jason
BERT, or like some hundreds of billions of parameters in a model. Nobody thought that was going to happen. But now it's jumping out and giving empirical signals. What the hell does that mean? That's a really fascinating question from the perspective of thinking about representation and what that means. Is all of machine learning really just about representation instead of inference and prediction? It sort of flips the whole thing inside out. You can get lost there, but I guess the function of a data scientist is then to map that to ROI: how do you take that, and then take the set of problems that your organization faces, and understand what to do and how to do it.
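The gzip idea can be sketched in a few lines. This is a toy illustration, not the paper's code: it assumes the approach amounts to a compression-based distance (the normalized compression distance) combined with a nearest-neighbor vote, and the training sentences and labels are made up.

```python
import gzip
from collections import Counter


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: small when a and b share structure."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def classify(query: str, labeled: list[tuple[str, str]], k: int = 3) -> str:
    """Label the query by majority vote among its k nearest training texts."""
    nearest = sorted(labeled, key=lambda pair: ncd(query, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: no training step, no parameters.
train = [
    ("the team won the basketball game last night", "sports"),
    ("the player scored in the basketball game", "sports"),
    ("the stock market fell as interest rates rose", "finance"),
    ("investors sold shares as the stock market fell", "finance"),
]

print(classify("the basketball game last night", train))
print(classify("interest rates rose as the stock market fell", train))
```

The distance function is the whole "model" here: texts that compress well together are treated as close, and the label comes from the neighbors.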
28:37.64
James
So yeah, just to give you the view of somebody who's not extremely versed in data science: in the somewhat early days of DALL-E and GPT, I immediately thought of a use case for it, in that you could give GPT a few keywords and have it generate a picture book, or at least an outline for a picture book, and then have each line of text run through DALL-E to create an image for that line of text. But it has fewer ties to mathematics and, ah,
29:18.33
Jason
Yeah, totally, you could definitely do that.
29:30.18
James
like, actual ROI, and maybe I'm just naturally curious.
29:32.12
Jason
Yeah, that's really true. I mean, that's the fascinating thing about these sorts of new models and new techniques: they're so visceral. They create this generative experience where it's being created in front of you. It's different, again, than,
29:51.39
Jason
you know, something like even deep learning, where something happens and then it makes a prediction.
29:56.57
James
I feel like I still have the shock factor of generative AI; it hits me every time I use either medium. You know, you ask a question of any of the large language models that are hosted and out there right now,
30:00.11
Jason
Um, yeah.
30:15.60
James
and I feel like I still get that shock factor, and with images too. You know, I feel like sometimes it's really good at things and sometimes it's really bad at things. Generative AI for images is, for some reason, really bad at understanding how humans eat spaghetti.
30:37.58
Jason
I wonder if that's showing that people usually avoid being photographed doing so, so it's underrepresented in the training set or something. That's a blind guess.
30:43.25
James
Oh man, that has to be it.
30:53.50
James
So I've seen some of your showcase models. I think if people look you up, they're going to find your site with some showcase models that you have, with computer vision, large language models. Are there any of them that excite you that you want to talk through?
31:11.84
Jason
Wow. I mean, they're all things that I pursued because I was just fascinated by the implications of these tools, you know, trying to take the opportunity to see what the thing is about. Can I cook something up? It's sort of a hackathon of one. I mean, these don't take weeks and weeks and months to produce; it's just, if I screw around with this, can I kind of figure it out? Yeah, thanks for mentioning it, and I'd be really excited for anyone to take a look and let me know what you think. There are some cool things that I've tried out, mostly with Transformers and pre-trained models in different contexts. Language, you mentioned, is one. I started out just doing a really dumb little chatbot. I mean, a lot of these things are effectively hardware constrained. In that little virtual environment that you get in a Hugging Face Space to run an app, you only have so much compute and memory and stuff, so the model is not as good as the real deal. But it was sort of like what we were talking about, just grabbing a model and wrapping it in KServe, I would say, and serving it and saying, this thing is running. It's the same move, backing into understanding what to do and how to do it with these tools. And then that thing transformed into text-to-SQL, which, speaking of the wow factor and the shock factor, when I got that thing to work, it's so simple, and I was like, this is really doing what it looks like it's doing. It's instantiating that whole SQL database, you know, through the Dockerfile and setting all that stuff up, and the text is actually turning into SQL and it's actually running.
32:46.15
Jason
It's pretty neat, and only a couple lines of code. There's a computer vision thing, satellite image processing, that I think is just a cool application of images. Satellite images are really, really versatile and a hugely important data set, I think. And then even time series. That one, I think, was sort of exploratory, maybe, from the authors who originally proposed the approach, I'm guessing, applying transformers to time series and showing that it's really effective. I mean, time series forecasting is such a huge field. There are so many different ways to skin that cat, and if you try transformers, which are conceptually a lot simpler than some of these others, it again pretty much performs at the state of the art, even in relatively resource-constrained environments. Another thing that powers a lot of those things is GPU processing, speaking of resource constraints, you know, because much of the computation has to be on those accelerated, optimized chips. So it's been, I mean, just playing around, an excuse for me to play with these things and see what I can get them to do. Which is fun. It's certainly the fun part.
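The text-to-SQL flow described above (a question goes in, SQL comes out, and the SQL actually runs against a database) can be sketched as below. This is not the demo's code: `fake_text_to_sql` is a hypothetical stand-in for a pre-trained model, the table and rows are invented, and an in-memory SQLite database is assumed in place of the database the demo sets up through its Dockerfile.

```python
import sqlite3


def fake_text_to_sql(question: str) -> str:
    """Hypothetical stand-in for a pre-trained text-to-SQL model."""
    canned = {
        "how many users are there?": "SELECT COUNT(*) FROM users",
        "who spent the most?": "SELECT name FROM users ORDER BY spend DESC LIMIT 1",
    }
    return canned[question.lower()]


def answer(conn: sqlite3.Connection, question: str) -> list:
    """Translate the question to SQL, run it, and return the rows."""
    sql = fake_text_to_sql(question)
    return conn.execute(sql).fetchall()


# Instantiate a small throwaway database (invented schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, spend REAL)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ana", 12.5), ("bo", 480.0), ("cy", 0.0)],
)

print(answer(conn, "How many users are there?"))
print(answer(conn, "Who spent the most?"))
```

In a real version, only `fake_text_to_sql` would change: the model's generated SQL would replace the canned lookup, and it would be worth validating that SQL before executing it.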
34:00.13
James
I love that you're still getting the shock factor, even though you're so deep into this field. Um, and that gives me hope that there's still so much to be discovered about all these things, right?
34:00.21
Jason
Ah, working in the field.
34:07.65
Jason
Um.
34:12.26
Jason
Oh yeah. I mean, I think that is, like, to really get on the science soapbox, that's why people like science: because you discover things. You know, it's like I was saying before, the experience is mostly one of being stuck. But then every once in a while you get unstuck and you're like, oh shit, you know, I now understand things in this totally different way. That's something that I think is easy to enjoy when you experience it.
34:33.94
James
And so I've been in tech for well over thirteen years now, and I've seen it slowly stagnate, in that, you know, we have an affinity towards frameworks, we have an affinity towards backends, the way we do things. And I feel like it's slowly reaching, as you would call it, max entropy. I don't know, I feel like AI is a big change-up, like, yeah.
35:01.76
Jason
Oh, you know, engineering is about building systems that scale, and if the use case is something like log processing, then you don't have to get too much more complicated than map and reduce. But if the use case is a language model, or deploying and serving a portfolio of models for various reasons, um, then now you need a system that can support that use case, and that's where all of that, you know, infrastructure knowledge, and really, really deep knowledge, comes into play.
35:38.41
James
So I want to switch gears a little bit, and maybe this is a dreaded topic, but I feel like data has driven interesting results for applications like Facebook, Instagram, TikTok. Um, and the other day I talked to somebody really passionate about technology ethics. How do you feel like ethics are playing into data-driven development? I mean, we have metrics now, at least from these companies, that are dead focused on capturing every bit of user attention and.
36:15.90
Jason
Yeah, that's definitely true. I think as a consumer the most important thing is to be thoughtful about what data you're sharing and to what end, you know. We were talking about the wow factor: a lot of people were really hit with a product wow factor like ten or fifteen years ago, when there were now all these digital products and they're free. Like, that's incredible. You know, that's totally different than any other consumer experience ever. But also, I mean, there's no free lunch. You know, that also means that somewhere along the line something is productized. I think as a consumer it's just important to be mindful of these sorts of things. And people vary on, like, do you want to block cookies, or limit the amount of activity that you have on social media, etc. But at least you're thinking about that as a consumer, and I think that's important. That's on the demand side. On the supply side, as somebody building machine learning services, I think you have to be really laser focused on: is the product that you're building, or is the decision that you're helping to optimize, really in the best interest of the organization and its customers? Um, and if not, or at least in order to hedge against the possibility, to build in enough monitoring and transparency in the model, so the decisions about, you know, pretty big subjects like what is right and wrong aren't just being made.
37:45.42
Jason
In the Python code that belongs to, like, one person, but rather at an organizational level, so, you know, these kinds of decisions are democratized and, you know, scaled horizontally, basically, across maybe not everybody in an organization, but across the group of people whose responsibility it is to own that decision. Another thing I've been reading about recently, and this will come back on topic, is the notion of data governance, which is becoming important to many organizations, kind of along the same lines. It's mostly about hedging regulatory risk, but in addition, thinking through ideas of ethics, etc., at an organizational, strategic level. It's mostly about, um.
38:24.50
Jason
Change management, really, like managing the fact that this is an organizational muscle that doesn't get a lot of exercise, and thinking about things in data in this coordinated way is maybe even counter to a lot of ingrained practices in some organizations. But it's required by the responsibilities that having all that data, and being able to use that data, places on an organization. Certainly as someone that's actually building, you know, the thing, I think it really would serve any data science team well to scale out that decision-making framework, scale out those responsibilities, and create that organizational transparency, so that it's not all funneling through, like, one person who may not even realize the scope of that kind of thing, you know.
39:06.00
James
So yeah, I think that's very thoughtful. Um, what about on the AI side? I think it's been all over the news as of late, companies like OpenAI facing lawsuits over scraping data. I think they're all sort of based on building models from what people perceive to be, you know, I think we all perceive it to be, other people's work. What's your read on that situation?
39:31.81
Jason
Yeah, yeah, that's a complicated topic. I mean, I think, um, we're seeing some of the leading indicators, maybe, of how AI affects the economy, broadly speaking. And how it shakes out is completely unclear, I mean, as unclear to me as it is to any other person. There's just so much that's not knowable. And in addition, like, what should be the rules? Again, I think that's the kind of question that requires a certain amount of putting heads together to really understand what the important motivations are.
40:06.14
James
What about generative AI in the way of, like, the image medium? I'll speak to this: my wife is a children's book illustrator, and honestly I think she doesn't have many worries about generative AI.
40:18.32
Jason
Oh.
40:25.32
James
Um, but I know that many in her field are worried about that, about, you know, AI stealing art styles or, you know, building upon art styles, um, and then redistributing that work. Um.
40:41.62
Jason
Um, yeah, yeah, I think that it really remains to be seen how it washes out. It remains a gray area, I guess, but there was this whole effort, or like a leveling up in terms of consciousness, around things like digital rights protection with online music and stuff. That's obviously just an intermediate stop on the way to understanding, you know, how the economy digests these kinds of competing forces, but I think something like that will probably emerge. You know, it's an issue. I mean, you have strikes, you have people complaining and protesting, rightly so, but in addition, the actual practical implications, like, we haven't really seen where it shakes out. It's just so, I don't know, awesome, in a really literal sense. It's really unclear. There's so much uncertainty about what actually happens. Who knows.
41:39.51
James
So I think there are numerous industries being, dare I say, disruptive, or disrupted by AI. What do you think is the next industry to be disrupted? I mean, there's art right now, there's, you know, content writing and numerous other things, but is it, like, you know, generative music?
42:06.97
Jason
Um, yeah. I mean, I think I saw something just yesterday or the day before that was supposed to be a recording, maybe, between a real customer and, um, a customer service agent who is actually a language model, or, you know, an audio model, having a real sales conversation or customer service interaction. Is that real or not? I don't know, I didn't make it, but it's up there. At least that's something that somebody is thinking about. I don't know, someone will try it, certainly. Well, I'm sure they already have. Is it any good? That's, again, your mileage may vary. My guess is that it's not going to replace music as we know it, but things like computer-generated, I don't know, images that show when you're in an elevator, things like new types of screen savers or something like that. I don't know, there are so many different corners that you can crawl into trying to think through what's going to happen, from the miniature, like screen savers, which is admittedly not a real example, to things like people thinking this is going to wipe out the entertainment industry. It's not a hype cycle, but there's a similar shape: there's a peak and there's a trough, and the truth is somewhere in the middle. Where we are on, you know, that ride, I have no idea. But one thing that I think is.
43:27.65
Jason
Ah, a given is that it's going to keep on changing.
43:30.93
James
And speaking to the extreme of the trough, a few years ago Computerphile, I think it is, on YouTube released a video about how AI can destroy the world. I think their example was something like an AI-driven letter sender, like a mailer, that ended up turning humans into stamps or something like that. Um, so the question is, do you think that AI could destroy the world? Like, are we all in danger? Is it like six months to
43:53.52
Jason
Um.
43:59.47
Jason
Six months, it would be pretty hard to pin down. I do know for a fact that that's basically what the new Mission: Impossible movie is about, which is a pretty fun watch if you're into AI destroying the world. There's this, yeah, AI bad guy called the Entity, and afterwards I told my wife the Entity is probably written in Python, you know, which I think is probably true. But anyway, people are certainly thinking through these things. Is AI going to destroy the world in six months? I would guess no. Um, but if it was yes, then the data set I'm using to make that prediction would be biased, so I wouldn't know in the first place. So, I mean, data doesn't remove uncertainty. It can just help you to manage it. These are the kinds of things that are pretty difficult to say.
44:47.45
James
So, alright man, I've got to switch gears a little bit, and I hate to talk to you about something I've been talking with other guests about, and it's negative news in tech, primarily massive layoffs, Meta, wherever. What's your take on that?
45:06.60
Jason
Um, yeah, interest rates are no joke. I think people are discovering this in many different shapes and forms. I think that's the long story short. I think depending on how risky and how optimistic your company is, um, and how aggressively they're managing the stock price, maybe, these effects are transmitted right down the chain to, you know, at this stage I think probably hundreds of thousands of people across the technology economy. It's a huge deal.
45:44.33
Jason
And again, speaking of things taking time to work through the system, like interest rates, monetary policy, that operates over long spans of time. So I think that the results are TBD, but we can certainly see the effect in the meantime. I was affected, and I
46:02.83
Jason
Think that it's changed the nature, really, of looking for a job, and will probably change what those jobs are like, you know, a couple months down the line, a year from now, just because of the magnitude of its impact on the whole sector.
46:21.94
James
Say you're, you know, laid off, looking for work. I think, you know, among listeners of this podcast there's a definite possibility somebody listening has been laid off. Maybe now is the opportunity for them to get into data science, um, ML, AI. Like, what do you think's the best way for them to get there?
46:43.55
Jason
Um, oh yeah, I think the best way is to pick something that you're interested in, ideally a kind of problem. I mean, maybe many people would say, well, language models are what I'm interested in, which I'd say arguably is a solution rather than a problem. And it's very easy to be motivated by how exciting a particular solution is, and also important to understand how to use those techniques, but the best way to get into data science is to pick a problem that you really care about, or a dataset that you really care about. Like, I mentioned reading a paper about basketball data. Or if you love any other kind of field that generates data, pick a data set or a problem you care about and then understand what kinds of questions you can answer on that data set, or what kinds of techniques you can use to answer that problem, etc., etc. I used to say put it on GitHub. Now I would say maybe put it on Hugging Face and make a Streamlit app; it's really easy. You can even put it in a Google Colab if you want to use a GPU, still for free, also pretty easy. Um, so I would say just start doing stuff and, you know, putting it up, like sticking it to the wall and talking about it, because probably people will find it interesting. And, you know, we were talking about generating those opportunities for communication, generating those interactions. It's the kind of thing that generates those interactions until they collect into, like, a, you know, community, I suppose.
48:04.25
James
And I love it. So I always ask a food-related question to wrap up, so I hope you're ready. Ah, if you were reincarnated as a food, what food would you be?
48:10.45
Jason
Um, okay.
48:18.80
Jason
Oh, that's a good one. Reincarnated as food. Boy, that's cheeky. I'm gonna say.
48:34.60
Jason
Um, I'm going to say, like, a really excellent pasta with red sauce. But yeah, I'm standing on that.
48:40.61
James
Yes, okay, all right.
48:47.85
James
All right, anything we haven't touched on, anything you want to mention or shout out, Jason?
48:54.75
Jason
Um, well, thanks again, James. This has been fun, and I appreciate you having me on, and congratulations on being a podcaster. That's amazing.
49:02.88
James
Thanks again, Jason, for joining me, and thank you for listening to the James from Montana Podcast.
49:19.80
James
If you want to support this production or see more content like this, visit jamesfrommontana.com and consider signing up as a member. I'll have links to anything we talked about in this podcast in the description, and also Jason's sites, where you can maybe see some of the stuff he's been working on.
49:37.54
Jason
Thank you, James.
Podcast Episode 3: AI, Data Science, & ML ~ Jason Dolatshahi
In this episode, I’m joined by Jason Dolatshahi, a thought leader in data science. Listen in as we talk about everything from AI destroying the world and the tech recession to how engineers can bolster their knowledge of ML/AI and make a career transition.