Tuesday, November 22, 2022

Lots of annoying admin stuff, but also testing out some custom calculations

 I’ve been learning more than I ever wanted to know about login-based license management for Tableau Server, yet I still feel like I don’t have any answers about why some errors keep persisting. How do you have literally pages and pages of documentation with so little actual detail in some spots? There’s pages of explanations on tokens, registry entries, commands to run, etc, but to me it looks like it’s all just repeating the same broad strokes without helping if you run into something truly off-label.

This will mean nothing to people who don’t know the product. I’m sorry. I have no more spoons to explain it, and I also don’t think I know anything useful; I certainly don’t know anything more than I did three weeks ago, which is truly sad considering how much I’ve researched it. The long and short is, we converted Tableau desktop licenses over to the server, and the activation and maintenance of permission to have the license happens via authentication to the server, not through separate product keys. And I keep getting an error that I’ve installed too many times for my license, which is not actually true.

Maybe someday I’ll get to go back to actually working on data modeling and building visuals.

Side note: I did build out a couple of data lenses and got to explore that a bit more. It’s an interesting feature - It allows for ad-hoc exploration of the data, but it’s a bit hard to understand what to do. I like a natural language query better, but considering it didn’t take too long to set this up, it’s not bad.

In other news: I’ve had a request to add some content to our enrollment reporting dashboard (in Pyramid Analytics).

This is exciting because my customer is asking for a view of data that’s based on a set of spreadsheets they’ve been maintaining for a while, which is using data that is already in our model. I’ve been looking for an opportunity to build something more mature for that data, but until now, didn’t have a clear direction on what measures and visuals were going to be the most useful.

Today I spent some time working up custom calculations in Pyramid that shows the figures they want, with the dimensions they need. I’ll be using my tested custom lists and measures to show our developers what we want added to the cube officially. We can do it ourselves, but we have a contract with a vendor to do things like this, which will save us a lot of time.

We have a set of dimensions and measures that report on student applications for admission. The main dimension is application status, and there are a bunch of measures for counting applications and showing how, for example, a particular status, compares to other statuses, or what our application or admit yield is, etc. It turns out there is some overlap between the “canceled” and “admitted” statuses; there are two groups of canceled apps: those who had an application decision (admitted or denied) and then canceled their app, and those who didn’t have a decision before they canceled their app. My customer has been breaking that data apart manually, and then using logic based on that to split the measures out into two groups, and, in some cases, conditionally selecting to display a particular measure for just one group or the other. We care about deferred applications when we look at overall applications measures, but we don’t want that displayed for the original term’s admits, for example, and we want to see fixed calculations on canceled apps along the accepted/admitted dimension. That particular member, or level, doesn’t exist in the “status” hierarchy yet, but it soon will. It may be that this has revealed an issue with the way the statuses and decision codes are transformed; there is a dimension that’s basically a flag to indicate if a student was accepted, but it doesn’t have much impact on the numbers, and it’s not very detailed (the member names are “Yes” and “All” 🙄). There’s also a dimension indicating if the transaction we are looking at is the “final” transaction for an app, which is worth exploring; as far as I know, the application status should default to showing the most recent effective status already. By the way, if it sounds like I don’t know much about some of the dimensions, that’s right - this data warehouse has been in place for 6+ years and I’ve been working with just some of its perspectives for about three years, though not full time. I also write queries and reports on two other platforms, and do a lot of special reporting projects, so when I get to learn something new about a dimension or measure, it’s usually because of a specific request. I don’t have a lot of time for exploration that doesn’t have a developed use case.

I made good progress towards a project that should eliminate some manual data manipulation and help us further direct our decision-makers towards faster automated self-service analytics for their decision support. I feel pretty good about that. 😃

(Now, anyway. If you’d asked me six hours ago I might have been a little more frustrated than satisfied. 😅 It is always thus when in the throes of development and testing.)

Friday, November 11, 2022

This week I am learning: recursive queries, data architecture, relationship building

 https://habr.com/en/company/postgrespro/blog/490228/

This is something I am interested in learning more about, particularly in SQL using WITHs, because it’s not something I’ve seen done much. I’ve learned (Oracle) SQL partly through formal training, but more practically through collaborating with colleagues and reviewing/editing/tuning queries that are part of my organization’s asset library (reports we already have published, views that I peek at under the hood and use as examples for how to accomplish something in another context, etc). That has been very helpful, but it’s also a hindrance in the case where there is a culture or habit of doing something a certain way, and you don’t get exposure to other methods, or if there isn’t much need for a particular technique. For example, I’ve taught myself to use ANSII join syntax outside my organization’s community of practice because mostly the older style is still used there, but I like the way the newer method separates the join conditions from the others.

So recursive queries using WITHs, and actually just using WITHs (common table expressions) in general is not something I see a lot in my work. Looping through a set is usually accomplished using PL/SQL or with some other procedure tool. I haven’t learned PL/SQL in my position for a few reasons, though I am thinking I should learn it, at least in part because it could be a good way to introduce myself more fully to procedural techniques in a familiar environment. I have reviewed a lot of our function packages and understand the basics, Plus, I have done some beginner’s lessons in Python, so the concepts of doing basic functions have taken root, so to speak. Though I recognize that Python is an object oriented language. Adding this here as something I’m adding to my reading list: https://www.cs.rpi.edu/~sibel/csci4380/fall2019/course_notes/sql_procedural.html

When I first started my job, years ago, the only thing I needed to understand about SQL was how to use it to create reports for our staff and faculty to use in their daily operations. Much more of my job was working with customers directly to understand what they needed and how they could effectively use the data we provided. Exception reports, pulling batches of records for reviewing and processing records, or contacting students, or loading flat files to other services or processes - that was mostly it. Querying other databases or doing more complex work was left to our developers, who essentially did “integration” stuff and advanced programming, and if we needed data to be queried alongside our Oracle data (the system of truth), the answer was often to bring a dataset into that environment in a custom table using a cron. I don’t thing that’s a bad thing, but the tools at our disposal are changing, and have been for a while. It’s now possible for someone like me - not a trained programmer, but someone who develops queries and models using SQL and application tools - to connect to a variety of sources or even use APIs to pull data into a model in seconds. That’s opened up a lot of possibilities for us to build up flexibility and capacity for meeting evolving analytics needs, but it also exposes a skill gap. I’m facing the reality that I’ve got a lot of options I don’t know how to use - or how I know to use it would be less efficient, less effective, and just more simplistic than the way an experienced programmer might approach the same thing.

Back to recursive queries with ctes - it may be that there just isn’t a lot of justification to use that technique in my current situation, which is why I haven’t seen it. Knowing that may be the case, I don’t know how much time I should even spend on learning it. And if it’s just that I’m interested in more procedural methods, or even in expanding my skills to meet current demands, does that mean I should focus elsewhere? Right now it feels like there is a lot I don’t know, and I’m not sure where I should start. I want to consider application of skills to my current job - “how will I use this in my work in the near term?” I also want to invest in the most beneficial learning plan from an objective standpoint - “regardless of where my organization is at the moment, what’s the best plan of study for leveling up generally in data science and analysis?”

Further, I’ve also got needs to meet in terms of application and platform management, as well as learning about and planning to build data architecture that supports strategic goals.

The bright side is that recently I have had some very encouraging opportunities to connect with people who have expertise in some of these areas, within as well as without my organization. Even better, when I’ve anticipated an unwelcoming response, I’ve often been pleasantly surprised. I have walked away from a lot of conversations recently feeling even more convinced of how much I don’t know, but I’ve also felt, at least some of the time, that some of the people who know more than I do have extended an invitation to me to come back and continue the conversation - they aren’t laughing behind their hands at my mistaken assumptions or blank spots. 

Sunday, November 6, 2022

I might want to learn R? Also imposter syndrome. Also DST ends.

TIL that my dog doesn’t care about the ending of Daylight Savings Time.

She really wanted breakfast at the same old time and she made that known today.

Which is great for a dog who, last year, was in the middle of an arduous auto-immune disorder diagnosis and treatment process and wasn’t always very interested in food. But it was also not great in that we need to adjust to the new times, not just insist on doing everything an hour earlier.

Today I also learned that I might want to try expanding (again) into the programming languages of data science in addition to my Happy Little SQL queries and low-code BI viz apps. Of course, I’m already trying to talk myself out of it. It’s too hard, too overwhelming, etc etc. What if I suck at it? Didn’t I try teaching myself Python last year but ghosted my lessons as soon as life got a little too busy? What if I’m actually stupid, I’ve just been deep faking my entire professional career, and this one thing is going to topple my carefully built castle of lies and bullshit? 🙄

Lots of annoying admin stuff, but also testing out some custom calculations

 I’ve been learning more than I ever wanted to know about login-based license management for Tableau Server, yet I still feel like I don’t h...