Feed aggregator

SPM and GTTs

Tom Kyte - 2 hours 12 min ago
Howdy, I'm wondering how SPM and features like https://blogs.oracle.com/optimizer/post/what-is-add-verified-spm would be impacted by the presence of global temporary tables within the query (or queries). I've been looking for documentation that would outline how SQL plan management behaves when dealing with queries relying on GTTs, but I haven't had any luck so far. Basically I'm curious how reliably baselines, evolving, etc. can and do work when dealing with queries whose data sets could vary wildly depending on the contents of the GTTs. Cheers,
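As a starting point for investigating this, here is a minimal sketch (not part of the original question) of how to see what SPM has already captured for such statements; dba_sql_plan_baselines and dbms_xplan.display_sql_plan_baseline are the standard view and package, but the GTT name and sql_handle below are placeholders:
<code>
-- List any baselines captured for statements that reference the GTT
select sql_handle, plan_name, enabled, accepted, fixed, last_verified
from   dba_sql_plan_baselines
where  upper(sql_text) like '%MY_GTT%';   -- MY_GTT is a placeholder table name

-- Display the stored plan(s) for one of the handles returned above
select *
from   table(dbms_xplan.display_sql_plan_baseline(
               sql_handle => 'SQL_abcdef0123456789',   -- placeholder handle
               format     => 'basic'));
</code>
Comparing the accepted plans against what the optimizer wants to use for different GTT contents is one way to see how stable the baselines really are for these queries.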
Categories: DBA Blogs

Pickleball Drill: Drop Game 7-11

The Oracle Instructor - Mon, 2024-03-18 11:52

One of the most important shots in pickleball is the 3rd Shot Drop – the third shot of a rally, with which the serving team tries to move forward to the NVZ by playing a short ball.

Unfortunately it is also a rather difficult shot, which is why it should be practiced often.

If you are like me, you find games played for points much more exciting than drills. That is why I came up with this game.

It works with four, three, or even just two participants. The description is for four players.

Players A and B are to practice the drop shot. They stand where the serving team stands before the 3rd shot in a normal game. Players C and D stand where the receiving team stands after the return in a normal game – namely at the NVZ. In this drill, C and D start every rally. First, D plays a long ball diagonally. A attempts a drop shot. Then A and B move forward:

Depending on how good the drop shot was, they either come straight forward or advance gradually through the transition zone.

The game uses rally scoring, so both team AB and team CD can score points at any time. C and D take turns starting the rally. Once the first point has been played out, C now begins:

It is somewhat easier for C and D to score points than for A and B. Therefore C and D win at 11 points, while A and B already win at 7 points.

If only three players are available, one of them practices the drop shot. He switches sides each time. The opponents may only play to that side.

With two players, you play on only one half of the court.

The game has a good training effect for both teams, because these shots are typically the critical shots of every rally among advanced players – and you play and practice exactly these.

The scoring keeps motivation high. In our drop games so far it has turned out that after a relatively short time it is more often the drop team that wins at 7 points. That is quite all right, I think, because it gives you the feedback that you are doing the drop shot correctly.

Categories: DBA Blogs

Number Data type declaration with different length and performance impact

Tom Kyte - Mon, 2024-03-18 09:06
1. I have a few number columns with the data type declared as NUMBER, NUMBER(5), INTEGER or NUMERIC(10). I know that in a few cases the maximum data is 2 digits, and I see the columns declared as NUMBER(38) / NUMBER / NUMERIC(30) / INTEGER. If I don't declare such a column as NUMBER(2) and instead declare it as NUMBER(38) / NUMBER / NUMERIC(30) / INTEGER, will there be any performance issue when I have a table with millions of records and the column is used for updating data or in a WHERE clause?
2. VARCHAR2: I have a column holding a single character (Y/N). If I declare it as VARCHAR2(1 CHAR) instead of VARCHAR2(1 BYTE), will there be any performance issue when we use this column in a WHERE condition against millions of records?
3. Is it advisable to use ANSI data types in table declarations, or is it always preferable to use Oracle data types, and will there be any performance issue?
Please advise.
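A small hedged illustration of why the declared precision acts as a constraint rather than a storage format (the table and column names below are made up for the demo): the same value occupies the same number of bytes whichever of these declarations is used, which can be checked with vsize().
<code>
create table t_num_demo (
  n_free  number,        -- unconstrained
  n_2     number(2),     -- precision constraint only
  n_ansi  numeric(30)    -- ANSI alias, stored as NUMBER(30)
);

insert into t_num_demo values (42, 42, 42);
commit;

-- All three columns should report the same stored length for the value 42
select vsize(n_free), vsize(n_2), vsize(n_ansi) from t_num_demo;
</code>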
Categories: DBA Blogs

Update Partition table from another Partition table Performance issue

Tom Kyte - Mon, 2024-03-18 09:06
Hi, I am migrating from Sybase IQ to Oracle 19c. There are many updates happening from one or multiple tables. My Target_TBL table has 18 million records per partition and there are thousands of partitions (partitioned by VersionID). APP_ID is another key column in this table. I have 10 partitioned tables, partitioned by APP_ID, with around 10 to 15 million records each, and 5 non-partitioned lookup tables which are smaller in size. I have rewritten all the update statements as MERGE statements in Oracle 19c. All the updates happen for one VersionID only, which is in the WHERE clause, and I join the source table using APP_ID and another key column to update 70 to 100% of the records in each update.
1. The target table has a different key column for updates from the partitioned source tables of 10 to 15 million rows; I have to do this with 10 different MERGE statements.
2. The target table has different key columns for updates from the non-partitioned lookup tables; I have to do this with 5 different MERGE statements.
In Sybase IQ all of these updates complete in 10 minutes; in Oracle 19c they take more than 5 hours. I have enabled parallel query and parallel DML as well.
A) Can you suggest a better way to handle these kinds of updates?
B) In a few places the explain plan shows "PDML disabled because single fragment or non partitioned table used".
C) I leave the large source table updates to go with hash joins.
D) I force the lookup source table updates to use nested loops. Is this good or not?
E) If I need to use indexes, can I go with local/global indexes on the other key columns referenced for the lookup tables?
I appreciate any other suggestions to handle these scenarios. Example:
<code>
Merge INTO Target_TBL USING SOURCE_A
  ON (SOURCE_A.APP_ID=Target_TBL.APP_ID and SOURCE_A.JOB_ID=Target_TBL.JOB_ID)
When Matched then update set Target_TBL.email=SOURCE_A.email
  Where Target_TBL.VersionID = 100 and SOURCE_A.APP_ID = 9876;

Merge INTO Target_TBL USING SECOND_B
  ON (SECOND_B.APP_ID=Target_TBL.APP_ID and SECOND_B.DEPT_ID=Target_TBL.DEPT_ID)
When Matched then update set Target_TBL.salary=SECOND_B.salary
  Where Target_TBL.VersionID = 100 and SECOND_B.APP_ID = 9876;

Merge INTO Target_TBL USING Lookup_C
  ON (Lookup_C.Country_ID=Lookup_C.Country_ID)
When Matched then update set Target_TBL.Amount_LOCAL=Lookup_C.Amount_LOCAL
  Where Target_TBL.VersionID = 100;
</code>
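As a hedged sketch only (the table, column and literal values come from the question, but the parallel degree and the partition clause are illustrative assumptions, not a tested fix): one common pattern is to enable parallel DML explicitly in the session and to restrict the merge to the single target partition for the VersionID being processed; whether parallel DML is then actually used still depends on the plan, so check the Note section of the execution plan afterwards.
<code>
alter session enable parallel dml;

merge /*+ parallel(8) */ into Target_TBL partition for (100) t
using (select /*+ parallel(8) */ * from SOURCE_A where APP_ID = 9876) s
on (s.APP_ID = t.APP_ID and s.JOB_ID = t.JOB_ID)
when matched then update set t.email = s.email
where t.VersionID = 100;

commit;   -- parallel DML requires a commit before the session can re-query the table
</code>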
Categories: DBA Blogs

Gather STATS on Partitioned Table and Parallel for Partitioned Table

Tom Kyte - Mon, 2024-03-18 09:06
Hi, I have a table list-partitioned by VERSION_ID, with around 15 million rows per partition. A new partition VERSION_ID is created daily, a bulk insert loads the 15 million rows (500 columns), and then 10 MERGE updates are applied to multiple columns from multiple other tables. Is it good to gather stats once after the insert and then once more after the multiple updates? What is good practice, for performance, when gathering stats in these partitioned-table scenarios?
Second question: when I merge into the partitioned table from another partitioned table, I see the following in the explain plan when I use a parallel DML hint: "PDML disabled because single fragment or non partitioned table used".
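A hedged sketch of one common approach (the table and partition names are placeholders, and using incremental statistics is an assumption about what might suit this load pattern, not something stated in the question): set the incremental preference once, then gather statistics for just the freshly loaded partition after the insert and the merges have finished.
<code>
begin
  -- one-off preferences: maintain global stats incrementally from partition stats
  dbms_stats.set_table_prefs(user, 'MY_PART_TAB', 'INCREMENTAL', 'TRUE');
  dbms_stats.set_table_prefs(user, 'MY_PART_TAB', 'GRANULARITY', 'AUTO');

  -- after the daily insert + merge updates, gather just the new partition
  dbms_stats.gather_table_stats(
    ownname     => user,
    tabname     => 'MY_PART_TAB',
    partname    => 'P_20240318',      -- placeholder partition name
    granularity => 'PARTITION',
    degree      => 8
  );
end;
/
</code>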
Categories: DBA Blogs

DR setup involving replicated database

Tom Kyte - Mon, 2024-03-18 09:06
Howdy, The current setup I'm looking at is an OLTP production system running Oracle 19.20 (4-instance RAC) with Active Data Guard. This system is seeding a data warehouse running Oracle 19.20 by way of Oracle GoldenGate via an integrated extract. At present the warehouse does not have a DR solution in place, and that's the point of the post. I'm wondering what the best solution would be for a warehouse DR strategy when GoldenGate is in play like this. I assume Data Guard again, but I'm happy to hear other thoughts. The bulk of the questions I have involve the GoldenGate component: I'm not sure how that would need to be set up / configured in order to minimize the complexity of any role transitions on either the transactional or the warehouse side (or both), and which scenarios can be handled seamlessly versus which would require manual intervention. Thanks a bunch! Cheers,
Categories: DBA Blogs

Locate an Error in Wrapped PL/SQL

Pete Finnigan - Mon, 2024-03-18 08:46
I had a conversation a few weeks ago with someone who asked me how to find a missing table when you have a wrapped PL/SQL file and cannot see the source code and you install it and it gives an....[Read More]

Posted by Pete On 18/03/24 At 01:00 PM

Categories: Security Blogs

FastAPI File Upload and Temporary Directory for Stateless API

Andrejus Baranovski - Sun, 2024-03-17 09:32
I explain how to handle file upload with FastAPI and how to process the file by using a Python temporary directory. Files placed into the temporary directory are automatically removed once the request completes, which is very convenient for a stateless API.

 

If you think I’m geeky, you should meet my friend.

The Anti-Kyte - Sat, 2024-03-16 11:14

I’d like to talk about a very good friend of mine.
Whilst he’s much older than me ( 11 or 12 weeks at least), we do happen to share interests common to programmers of a certain vintage.

About a year ago, he became rather unwell.
Since then, whenever I’ve gone to visit, I’ve taken care to wear something that’s particular to our friendship and/or appropriately geeky.

At one point, when things were looking particularly dicey, I promised him, that whilst “Captain Scarlet” was already taken, if he came through he could pick any other colour he liked.
As a life-long Luton Town fan, his choice was somewhat inevitable.
So then, what follows – through the medium of Geeky T-shirts – is a portrait of my mate Simon The Indestructible Captain Orange…

When we first met, Windows 3.1 was still on everyone’s desktop and somewhat prone to hanging at inopportune moments. Therefore, we are fully aware of both the origins and continuing relevance of this particular pearl of wisdom :

Fortunately, none of the machines Simon was wired up to in the hospital seemed to be running any version of Windows so I thought he’d be reassured by this :

Whilst our first meeting did not take place on a World riding through space on the back of a Giant Turtle ( it was in fact, in Milton Keynes), Simon did earn my eternal gratitude by recommending the book Good Omens – which proved to be my gateway to Discworld.
The relevance of this next item of “Geek Chic” is that, when Simon later set up his own company, he decided that it should have a Latin motto.
In this, he was inspired by the crest of the Ankh-Morpork Assassins’ Guild :

His motto :

Nil codex sine Lucre

…which translates as …

No code without payment

From mottoes to something more akin to a mystic incantation, chanted whenever you’re faced with a seemingly intractable technical issue. Also, Simon likes this design so…

As we both know, there are 10 types of people – those who understand binary and those who don’t…

When confronted by something like this, I am able to recognise that the binary numbers are ASCII codes representing alphanumeric characters. However, I’ve got nothing on Simon, a one-time Assembler Programmer.
Whilst I’m mentally removing my shoes and socks in preparation to translate the message, desperately trying to remember the golden rule of binary maths ( don’t forget to carry the 1), he’ll just come straight out with the answer (“Geek”, in this case).

Saving the geekiest to last, I’m planning to dazzle with this on my next visit :

Techie nostalgia and a Star Wars reference all on the one t-shirt. I don’t think I can top that. Well, not for now anyway.

1, 2, 3 – Free!

The Oracle Instructor - Sat, 2024-03-16 06:35

Pickleball Drill: 1, 2, 3 – Free!

A nice warm-up drill that is also well suited for beginners:

All four players stand at the NVZ. Serving and scoring are as in a normal game.

The first three balls, including the serve, must land in the NVZ. After that the ball is released for offensive dinks, speed-ups and lobs:

Example: player A starts with a diagonal serve, D dinks (not necessarily) to B, and B plays the third ball into the kitchen to C. C then plays a long ball into the gap.

Background: We have played the drill in a similar way up to now, but with 5 balls that have to be played into the kitchen before the ball is released.

In my view that has two disadvantages:

  1. It teaches the participants the wrong kind of dinks, namely harmless "dead dinks" into the kitchen. In serious play, however, dinking is not primarily about hitting the kitchen at all costs. A dink should be as unattackable as possible, yet as unpleasant as possible for the opponent, so that he returns a high ball that we in turn can attack. That can well be a ball that bounces just behind the NVZ. Under the old drill model, though, that counts as a mistake. Later on you often have trouble teaching people the right kind of dinks.
  2. You have to count the balls up to 5. That often does not work so well, so you end up in doubt: was that 5 already?

With 1, 2, 3 – Free! it is easier to keep track. Still, the server is (as in the full game) at a slight disadvantage, because the receivers can be the first to play an offensive ball – for example a forceful dink just behind the NVZ.

Categories: DBA Blogs

How to Create Urdu Hindi AI Model and Dataset from New Dataset

Pakistan's First Oracle Blog - Fri, 2024-03-15 21:54

 This video is a hands-on, step-by-step tutorial on creating a new dataset and an AI model, fine-tuning the model on the dataset, and then pushing it to Hugging Face.




Code:

%%capture
import torch
major_version, minor_version = torch.cuda.get_device_capability()
# Must install separately since Colab has torch 2.2.1, which breaks packages
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8:
    # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)
    !pip install --no-deps packaging ninja flash-attn xformers trl peft accelerate bitsandbytes
else:
    # Use this for older GPUs (V100, Tesla T4, RTX 20xx)
    !pip install --no-deps xformers trl peft accelerate bitsandbytes
pass

!pip install einops

from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-7b-bnb-4bit", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = " ", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    use_gradient_checkpointing = True,
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

alpaca_prompt = """ذیل میں ایک ہدایت ہے جو فلم کے نام کی وضاحت کرتی ہے، اس کے ساتھ ایک ان پٹ بھی ہے جو مزید دستاویزات فراہم کرتا ہے۔ گانے کے بول لکھنے کے لیے ایک لمحہ نکالیں جو فلم کے نام کے معنی سے میل کھاتا ہے۔

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN
def formatting_prompts_func(examples):
    instructions = examples["urdu_instruction"]
    inputs       = examples["urdu_input"]
    outputs      = examples["urdu_output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }
pass

from datasets import load_dataset
dataset = load_dataset("fahdmirzac/urdu_bollywood_songs_dataset", split = "train")
dataset = dataset.map(formatting_prompts_func, batched = True,)

from huggingface_hub import login
access_token = "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"
login(token=access_token)

from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 100,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)

trainer_stats = trainer.train()

FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt.format(
        "دیے گئے فلم کے نام کے بارے میں ایک مختصر گیت کے بول لکھیں۔", # instruction
        "کیوں پیار ہو گیا", # input
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 200, use_cache = True)
tokenizer.batch_decode(outputs)

FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt.format(
        "دیے گئے فلم کے نام کے بارے میں ایک مختصر گیت کے بول لکھیں۔", # instruction
        "رنگ", # input
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 200, use_cache = True)
tokenizer.batch_decode(outputs)

model.push_to_hub("fahdmirzac/Gemma_Urdu_Hindi_Bollywood_Songs", token = "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")

Categories: DBA Blogs

Index Usage – 1

Jonathan Lewis - Fri, 2024-03-15 04:21

In 12.2 Oracle introduced Index Usage Tracking to replace the previous option for “alter index xxx monitoring usage”. A recent post on the Oracle database discussion forum prompted me to look for articles about this “new” feature and what people had to say about it. There didn’t seem to be much information online – just a handful of articles starting with Tim Hall a few years ago and ending with Maria Colgan a few months ago – so I thought I’d update my notes a little and publish them.

Unfortunately, by the time I’d written the first 6 pages it was starting to feel like very heavy going, so I decided to rewrite it as a mini-series. In part one I’ll just give you some descriptions and explanations that are missing from the manuals; in part two I’ll do a bit of a fairly shallow dive to talk about what’s happening behind the scenes and how you can do some experiments; in part three I’ll describe some of the experiments and show the results that justify the descriptions I’ve given here in part one.

History

In the bad old days you could enable “monitoring” on an index to see if it was being used. The command to do this was:

alter index {index name} monitoring usage;

After executing this statement you would wait for a bit then check the view dba_object_usage:

SQL> desc dba_object_usage
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 OWNER                         NOT NULL VARCHAR2(128)
 INDEX_NAME                    NOT NULL VARCHAR2(128)
 TABLE_NAME                    NOT NULL VARCHAR2(128)
 MONITORING                             VARCHAR2(3)
 USED                                   VARCHAR2(3)
 START_MONITORING                       VARCHAR2(19)
 END_MONITORING                         VARCHAR2(19)

SQL> select * from dba_object_usage;

OWNER           INDEX_NAME           TABLE_NAME                MON USE START_MONITORING    END_MONITORING
--------------- -------------------- ------------------------- --- --- ------------------- -------------------
TEST_USER       T2_I1                T2                        YES YES 03/12/2024 15:31:35

1 row selected.

As you can see, this didn’t give you much information – just “yes it has been used” or “no it hasn’t been used” since the moment you started monitoring it; and that’s almost totally useless as an aid to measuring or understanding the effectiveness of the index.

Apart from the almost complete absence of information, there were collateral issues: I think that, initially, gathering stats, index rebuilds and using explain plan would flag an index as used; at the opposite extreme indexes that were actually used to avoid foreign key locking problems were not flagged as used.

And now for something completely different

The promise of Index Usage Tracking is clearly visible in the description of the view you use to report the details captured:

SQL> desc dba_index_usage
 Name                                Null?    Type
 ----------------------------------- -------- ------------------------
 OBJECT_ID                           NOT NULL NUMBER
 NAME                                NOT NULL VARCHAR2(128)
 OWNER                               NOT NULL VARCHAR2(128)
 TOTAL_ACCESS_COUNT                           NUMBER
 TOTAL_EXEC_COUNT                             NUMBER
 TOTAL_ROWS_RETURNED                          NUMBER
 BUCKET_0_ACCESS_COUNT                        NUMBER
 BUCKET_1_ACCESS_COUNT                        NUMBER
 BUCKET_2_10_ACCESS_COUNT                     NUMBER
 BUCKET_2_10_ROWS_RETURNED                    NUMBER
 BUCKET_11_100_ACCESS_COUNT                   NUMBER
 BUCKET_11_100_ROWS_RETURNED                  NUMBER
 BUCKET_101_1000_ACCESS_COUNT                 NUMBER
 BUCKET_101_1000_ROWS_RETURNED                NUMBER
 BUCKET_1000_PLUS_ACCESS_COUNT                NUMBER
 BUCKET_1000_PLUS_ROWS_RETURNED               NUMBER
 LAST_USED                                    DATE

Though the columns are not very well described in the reference manuals you can see very clearly that there’s a lot more detail than just “yes/no” here. The columns clearly carry information about “how many times” and “how much data”, breaking the numbers down across a small range-based histogram. Here’s an example of output (using Tom Kyte’s print_table() routine to turn columns to rows):

SQL> execute print_table('select * from dba_index_usage where name = ''T1_I1''')
OBJECT_ID                      : 206312
NAME                           : T1_I1
OWNER                          : TEST_USER
TOTAL_ACCESS_COUNT             : 889
TOTAL_EXEC_COUNT               : 45
TOTAL_ROWS_RETURNED            : 17850
BUCKET_0_ACCESS_COUNT          : 0
BUCKET_1_ACCESS_COUNT          : 0
BUCKET_2_10_ACCESS_COUNT       : 0
BUCKET_2_10_ROWS_RETURNED      : 0
BUCKET_11_100_ACCESS_COUNT     : 878
BUCKET_11_100_ROWS_RETURNED    : 13200
BUCKET_101_1000_ACCESS_COUNT   : 9
BUCKET_101_1000_ROWS_RETURNED  : 1650
BUCKET_1000_PLUS_ACCESS_COUNT  : 2
BUCKET_1000_PLUS_ROWS_RETURNED : 3000
LAST_USED                      : 11-mar-2024 20:26:26

The order of the columns is just a little odd (in my opinion) so I’ve switched two of them around in my descriptions below:

  • Total_exec_count: is the total number of executions that have been captured for SQL statements using this index.
  • Total_access_count: is the total number of scans of this index that have been observed. If you think of a nested loop join you will appreciate that a single execution of an SQL statement could result in many accesses of an index – viz: an index range scan into the inner (second) table may happen many times, once for each row acquired from the outer (first) table.
  • Total_rows_returned: carries a little trap in the word rows, and in the word returned. In this context “rows” means “index entries”, and “returned” means “passed to the parent operation”. (To be confirmed / clarified)
  • Bucket_0_access_count: how many index accesses found no rows (and there’s no bucket_0_row_count because it would always be 0).
  • Bucket_1_access_count: how many index accesses found just one row (and there’s no bucket_1_row_count because that would always match the access count).
  • Bucket_M_N_access_count: how many index accesses found between M and N rows.
  • Bucket_M_N_row_count: sum of rows across all the index accesses that returned between M and N rows.
  • Last_used: date and time of the last flush that updated this row of the table/view.

The most important omission in the descriptions given in the manuals is the difference between total_exec_count and total_access_count. (It was a comment on Maria Colgan’s blog note asking about the difference that persuaded me that I really had to write this note.) If you don’t know what an “access” is supposed to be you can’t really know how to interpret the rest of the numbers.
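For example (an illustrative query added here, not from the original note), the two counters can be put side by side with a query that uses only the columns described above:

select
        owner, name,
        total_exec_count,
        total_access_count,
        round(total_access_count / nullif(total_exec_count,0), 1)    accesses_per_exec,
        round(total_rows_returned / nullif(total_access_count,0), 1) rows_per_access,
        last_used
from
        dba_index_usage
where
        owner = 'TEST_USER'
order by
        total_access_count desc
/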

Take another look at the sample output above, it shows 45 executions and 889 accesses – I happen to know (because I did the test) that most of the work I’ve done in this interval has been reporting a two-table join that uses a nested loop from t2 into t1 using an index range scan on index t1_i1 to access table t1. I know my data well enough to know that every time I run my query it’s going to find about 20 rows in t2, and that for every row I find in t2 there will be roughly 15 rows that I will access in t1 through the index.

Give or take a little extra activity round the edges that blurs the numbers, I can see that they make sense:

  • 45 executions x 20 rows from t2 = 900 index range scans through t1_i1
  • 878 index range scans x 15 rows per scan = 13,170

The numbers are in the right ball-park to meet my expectations. But we do have 11 more accesses reported – 9 of them reported an average of 1,650/9 = 183 rows, 2 of them reported an average of 3,000/2 = 1,500 rows. Again, I know what I did, so I can explain why those numbers have appeared, but in real life you may have to do a little work to find a reasonable explanation (spoiler: be suspicious about gathering index stats).

It’s possible, for example, that there are a few rows in the t2 table that have far more than the 15 row average in t1 and the larger numbers are just some examples from the nested loop query that happened to hit a couple of these outliers in t2. (It’s worth highlighting, as a follow-up to this suggestion, that a single execution could end up reporting accesses and row counts in multiple buckets.)

In fact the 9 “medium sized” accesses were the result of single-table queries using a “between” clause that ranged through 10 to 15 values of t1 (returning 150 to 225 rows each), and the two “large” accesses were the result of two index-only queries where I forced an index full scan and an index fast full scan that discarded half the rows of an index holding 3,000 entries.

As I said, I’ll be presenting a few examples in part 3, but a guideline that may be helpful when considering the executions, accesses, and rowcounts is this: if you’re familiar with the SQL Monitor report then you’ll know that each call to dbms_sql_monitor.report_sql_monitor() reports one execution – then the Starts column for any index operation will (probably) be the total access count, and the Rows (Actual) column will (probably) be the total rows returned. As noted above, though, any one execution may end up splitting the total Starts and Rows (Actual) across multiple buckets.
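By way of illustration (a sketch added here, not part of the original note, and assuming the statement has been monitored and you know its sql_id), a call like the following lets you line up the Starts and Rows (Actual) columns of one execution against the usage-tracking buckets:

set long 250000 longchunksize 250000 linesize 254 pagesize 0

select
        dbms_sql_monitor.report_sql_monitor(
                sql_id       => '&m_sql_id',    -- substitute the sql_id of interest
                type         => 'TEXT',
                report_level => 'ALL'
        )
from    dual
/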

Some questions to investigate

I hope this has given you enough information to get you interested in Index Usage Tracking, and some idea of what you’re looking at when you start using the view. There are, however, some technical details you will need to know if you want to do some testing before taking any major steps in production. There are also some questions that ought to be addressed before jumping to conclusions about what the numbers mean, so I thought I’d list several questions that came to mind when I first read about the feature:

  • Does a call to dbms_stats.gather_index_stats result in an update to the index usage stats, and does it matter?
  • Does a call to explain plan result in an update to the index usage stats, and does it matter?
  • Do referential integrity checks result in the parent or child indexes being reported in the usage stats? What if there is a parent delete with “on delete cascade” on the child?
  • Do inserts, updates, deletes or merges produce any unexpected results (e.g. double / quadruple counting)? What if they’re PL/SQL forall bulk processing? What if (e.g.) you update or delete through a join view?
  • Does an index skip scan count as a single access, or does Oracle count each skip as a separate access (I’d hope it would be one access)?
  • If you have an index range scan with a filter predicate applied to each index entry after the access predicate, is the “rows returned” the number of index entries examined (accessed), or the number that survive the filter? (I would like it to be the number examined because that’s the real measure of the work done in the index, but the name suggests it counts the survivors.)
  • Does an index fast full scan get reported correctly?
  • Are IOTs accounted differently from ordinary B-tree indexes?
  • For bitmap indexes what is a “row” and what does the tracking information look like?
  • If you have an “Inlist Iterator” operation does this get summed into one access, or is it an access per iteration (which is what I would expect)? And how is the logic applied with partitioned table iteration?
  • Does a UNION or UNION ALL operation count multiple accesses (I would expect so), and what happens with things like nvl_or_expansion with “conditional” branches?
  • Does a “connect by pump” through an index produce any unexpected results?
  • Can index usage tracking tell us anything about Domain indexes?
  • Are there any types of indexes that are not tracked (sys indexes, for example)?

If you can think of any other questions where “something different” might happen, feel free to add them as comments.

Summary

Index Usage Tracking (and the supporting view dba_index_usage) can give you a good insight into how Oracle is using your indexes. This note explains the meaning of the data reported in the view and offers a couple of ideas about how you may need to interpret the numbers for a single index.

In the next two articles we’ll look at some of the technical aspects of the feature (including how to enable and test it), and the results captured from different patterns of query execution, concluding (possibly in a 4th article) with suggestions on how to use the feature in a production system.

Footnote

At the start of this note I said it had been prompted by a question on one of the Oracle forums. The thread was about identifying indexes that could be dropped and the question was basically: “Is the old index monitoring obsolete?” The answer is “Yes, definitely, and it has been for years.”

Using Claude 3 Haiku Vision with Amazon Bedrock Locally

Pakistan's First Oracle Blog - Fri, 2024-03-15 02:58

 This video is a hands-on guide on how to use the vision features of Anthropic's Claude 3 Haiku AI model with Amazon Bedrock.



Code Used:

import boto3
import json
import base64
from botocore.exceptions import ClientError

bedrock = boto3.client(service_name="bedrock-runtime",region_name='us-east-1')

modelId = "anthropic.claude-3-haiku-20240307-v1:0"

accept = "application/json"
contentType = "application/json"


# prompt = "What is written in this image?"
# image_path = "./images/ab55.png"

# prompt = "How many faces are there in this image and what are the expressions of those faces?"
# image_path = "./images/expression.png"

# prompt = "Tell me a short story about this image."
# image_path = "./images/hiking.png"

prompt = "What's the location in this image?"
image_path = "./images/whereisthis.png"


with open(image_path, "rb") as image_file:
    image = base64.b64encode(image_file.read()).decode("utf8")

request_body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 2048,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt,
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image,
                    },
                },
            ],
        }
    ],
}

try:
    response = bedrock.invoke_model(
        modelId=modelId,
        body=json.dumps(request_body),
    )

    # Process and print the response
    result = json.loads(response.get("body").read())
    input_tokens = result["usage"]["input_tokens"]
    output_tokens = result["usage"]["output_tokens"]
    output_list = result.get("content", [])

    # print("Invocation details:")
    # print(f"- The input length is {input_tokens} tokens.")
    # print(f"- The output length is {output_tokens} tokens.")

    # print(f"- The model returned {len(output_list)} response(s):")
    for output in output_list:
        print(output["text"])

except ClientError as err:
    print(
        "Couldn't invoke Claude 3 Haiku Vision. Here's why: %s: %s",
        err.response["Error"]["Code"],
        err.response["Error"]["Message"],
    )
    raise
Categories: DBA Blogs

Compact Pickleball Course for Beginners

The Oracle Instructor - Fri, 2024-03-15 01:41

Everything you need to get a qualified start in pickleball, brought to the point in a single day!
This 4-hour compact course is ideally suited for working people and athletically ambitious seniors.

The course costs 40 euros per person and is led by Uwe Hesse – an experienced player and a coach certified by the DPB.
It takes place in Düsseldorf.
The number of participants per course is limited to 8 in order to guarantee intensive and individual coaching.

Contents include:
Basic shots: serve, return, volley, dink
Rules
Basic doubles strategies
Scoring
The role of the non-volley zone
3rd Shot Drop
Participant doubles with coach feedback

Current dates:
Saturday, 27 April
Saturday, 4 May

Each course starts at 10:00 a.m.
Please register exclusively by email to info@uhesse.com

For members of DJK Agon 08 the course costs only 20 euros.
If you subsequently join the club, 20 euros will be refunded.

Categories: DBA Blogs

Create AI Agent in AWS with Boto3 Code

Pakistan's First Oracle Blog - Thu, 2024-03-14 22:03

 This video is a step-by-step tutorial, with code, on how to create Amazon Bedrock AI agents with boto3 in Python and integrate them with Lambda.



Code used (just use any Lambda function of your choice with it):


import logging
import boto3
import time
import yaml
import json
import io
from botocore.exceptions import ClientError

def create_agent(bedrock, agent_name, foundation_model, role_arn, instruction):
    try:
        # Create a low-level client with the service name
        response = bedrock.create_agent(
            agentName=agent_name,
            foundationModel=foundation_model,
            agentResourceRoleArn=role_arn,
            instruction=instruction,
        )
    except ClientError as e:
        logging.error(f"Couldn't create agent due to: {e}")
        raise
    else:
        return response["agent"]

def create_agent_action_group(bedrock, name, description, agent_id, agent_version, function_arn, api_schema):
    try:
        response = bedrock.create_agent_action_group(
            actionGroupName=name,
            description=description,
            agentId=agent_id,
            agentVersion=agent_version,
            actionGroupExecutor={"lambda": function_arn},
            apiSchema={"payload": api_schema},
        )
        agent_action_group = response["agentActionGroup"]
    except ClientError as e:
        print(f"Error: Couldn't create agent action group. Here's why: {e}")
        raise
    else:
        return agent_action_group

def prepare_agent(bedrock, agent_id):
    try:
        prepared_agent_details = bedrock.prepare_agent(agentId=agent_id)
    except ClientError as e:
        print(f"Couldn't prepare agent. {e}")
        raise
    else:
        return prepared_agent_details

def create_agent_alias(bedrock, name, agent_id):
    try:
        response = bedrock.create_agent_alias(
            agentAliasName=name, agentId=agent_id
        )
        agent_alias = response["agentAlias"]
    except ClientError as e:
        print(f"Couldn't create agent alias. {e}")
        raise
    else:
        return agent_alias



def main():
    # Define your parameters
    bedrock = boto3.client(service_name='bedrock-agent',region_name='us-east-1')
    agent_name = 'AstroAI'
    foundation_model = 'anthropic.claude-v2'
    role_arn = 'bedrock role arn'
    instruction = 'Your task is to generate unique and insightful daily horoscopes for individuals \
                   based on their zodiac sign. Start by analyzing the general characteristics and common \
                   themes associated with each zodiac sign. Consider traits, challenges, opportunities, \
                   and the emotional and physical wellbeing of individuals under each sign. Use this \
                   understanding to create personalized, relevant, and engaging horoscopes that offer \
                   guidance, reflection, and encouragement for the day ahead. Ensure the horoscopes \
                   are varied and resonate with the unique qualities of each sign, contributing \
                   positively to the users day.'

    # Call the create_agent function
    try:
        agent = create_agent(bedrock, agent_name, foundation_model, role_arn, instruction)
        agent_id = agent['agentId']
        print(f"Agent created successfully: {agent_id}")
    except ClientError:
        print("Failed to create the agent.")

    time.sleep(10)

    try:
        with open("api_schema.yaml") as file:
            api_schema=json.dumps(yaml.safe_load(file))
            name="AstroGroup"
            description="AI Astrologer"
            agent_version="DRAFT"
            function_arn="arn:aws:lambda:us-east-1::function:horoscope"
            agentgroup = create_agent_action_group(bedrock, name, description, agent_id, agent_version, function_arn, api_schema)                
            print(agentgroup['actionGroupId'])
    except ClientError as e:
        print(f"Couldn't create agent action group. Here's why: {e}")
        raise        

    time.sleep(5)

    agentprepared = prepare_agent(bedrock, agent_id)                
    print(agentprepared)

    time.sleep(20)

    agentalias = create_agent_alias(bedrock, name, agent_id)
    print(agentalias['agentAliasId'])

if __name__ == "__main__":
    main()

Categories: DBA Blogs

PLS-00103: Encountered the symbol "RECORD" when expecting one of the following: array varray table object fixed varying opaque sparse

Tom Kyte - Thu, 2024-03-14 13:06
Here I am creating a record type holding columns from emp and dept with the following syntax:
<code>
CREATE TYPE emp_dept_data IS RECORD
  (empno    NUMBER(4),
   ENAME    VARCHAR2(10),
   JOB      VARCHAR2(9),
   HIREDATE DATE,
   SAL      NUMBER(7,2),
   DNAME    VARCHAR2(14)
  );
</code>
I am getting the error: PLS-00103: Encountered the symbol "RECORD" when expecting one of the following: array varray table object fixed varying opaque sparse. Please tell me how to fix it. I am using Oracle 19c, and the record type is to be used in a pipelined function.
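The error appears because RECORD types can only be declared inside PL/SQL (a package specification or a declare block); a schema-level CREATE TYPE has to be an object, varray or nested table type. A minimal hedged sketch of the usual pattern for a pipelined function follows; the collection type, function name and the join on the standard emp/dept demo tables are illustrative assumptions, not taken from the question:
<code>
create or replace type emp_dept_data as object (
  empno    number(4),
  ename    varchar2(10),
  job      varchar2(9),
  hiredate date,
  sal      number(7,2),
  dname    varchar2(14)
);
/

create or replace type emp_dept_data_tab as table of emp_dept_data;
/

create or replace function emp_dept_pipe return emp_dept_data_tab pipelined is
begin
  for r in (
    select e.empno, e.ename, e.job, e.hiredate, e.sal, d.dname
    from   emp e join dept d on d.deptno = e.deptno
  ) loop
    pipe row (emp_dept_data(r.empno, r.ename, r.job, r.hiredate, r.sal, r.dname));
  end loop;
  return;
end;
/

select * from table(emp_dept_pipe);
</code>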
Categories: DBA Blogs

Musk Ox

Greg Pavlik - Wed, 2024-03-13 10:04

 Musk Ox is a fantastic instrumental chamber folk project from up in Canada. I recently stumbled on these guys and their guitarist Nathanael Larochette from some collaborations he did with the (now defunct, but often fantastic) Oregon-based post metal/neofolk band Agalloch. Here's a neat documentary on the making of their album Woodfall.



Another cool project is the acoustic spin off that Nathanael did from the last Agalloch album.



His solo stuff is great. Music about trees and such.

Querying LONGs

Jonathan Lewis - Wed, 2024-03-13 09:19

Update for 23c: If your only need for using LONGs in predicates is to query the partitioning views by high_value, you won’t need to read this note as the views now expose the columns high_value_clob and high_value_json. (See comment #3 below.)
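(As an illustrative sketch added here, not part of the original note: on 23c the query discussed below could presumably be written against the new column along these lines, using dbms_lob.substr() to avoid comparing the CLOB directly. The owner and table names are the ones used in the examples later in the note.)

select
        atp.table_owner, atp.table_name,
        atp.partition_name, atp.partition_position
from
        all_tab_partitions atp
where
        atp.table_owner = 'TEST_USER'
and     atp.table_name  = 'PT_RANGE'
and     dbms_lob.substr(atp.high_value_clob, 4000, 1) = '200'
/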

Despite their continued presence in the Oracle data dictionary, LONG columns are not an option that anyone should choose; you can’t do much with them and they introduce a number of strange space management problems. Nevertheless a recent thread on the Oracle database forum started with the following question: “How do you use LONG columns in a WHERE clause?”. The basic answer is: “You don’t”.

This user wanted to query all_tab_partitions for a given table_name and high_value, and the high_value is (still) a LONG, so attempts to use it resulted in Oracle error “ORA-00997: illegal use of LONG datatype”. A possible, and fairly obvious but undesirable, solution to the requirement is to write a PL/SQL function to read the current row from all_tab_partitions and return the first N characters of the high_value as a varchar2(). Here’s a version (not quite the one I posted) of such a function, with a sample of use:

rem
rem     Script:         get_high_value.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2024
rem 

create or replace function get_high_value (
        i_tab_owner varchar2,
        i_tab_name varchar2, 
        i_part_name varchar2,
        i_part_posn number
)
return varchar2
is
        v1 varchar2(4000);
begin
        select  atp.high_value 
        into    v1 
        from    all_tab_partitions atp
        where   atp.table_owner         = upper(i_tab_owner)
        and     atp.table_name          = upper(i_tab_name)
        and     atp.partition_name      = upper(i_part_name)
        and     atp.partition_position  = upper(i_part_posn)
        ;
        return v1;
end;
/

select
        apt.table_owner, apt.table_name, 
        apt.tablespace_name, 
        apt.partition_name, apt.partition_position,
        apt.high_value
from
        all_tab_partitions      apt
where
        apt.table_owner = 'TEST_USER'
and     apt.table_name  = 'PT_RANGE'
and     get_high_value(
                apt.table_owner, 
                apt.table_name, 
                apt.partition_name,
                apt.partition_position
        ) = '200'
/

This seemed to work quite well and sufficiently rapidly – but I only had two partitioned tables in my schema and a total of 12 partitions, so it’s not sensible to look at the clock to see how efficient the query is.

Another possible solution introduced me to a function that has been around for years (and many versions) which I had never come across: sys_dburigen(). PaulZip supplied the following code (which I’ve altered cosmetically and edited to pick up a table in my schema):

select  *
from    (
        select 
                dbms_lob.substr(
                        sys_dburigen (
                                atp.table_owner, 
                                atp.table_name, 
                                atp.partition_name, 
                                atp.partition_position, 
                                atp.high_value, 
                                'text()'
                ).getclob(), 4000, 1)           high_value_str,
                atp.table_owner, atp.table_name, atp.partition_name,
                atp.tablespace_name, atp.high_value
        from    all_tab_partitions atp
        where   atp.table_owner = 'TEST_USER'
        and     atp.table_name  = 'PT_RANGE'
        )
where   high_value_str = '200'
/

This was so cute, and looked like a much nicer (i.e. “legal”) solution than my PL/SQL hacking, that I had to take a closer look at sys_dburigen() – first to understand what it was supposed to achieve (yes, I do RTFM), then to see how it actually worked.

Something I did first was simply to strip back the layers of the expression used to supply the high_value_str, which took me through the following four combinations (with and without ‘text()’, with and without ‘getclob()’). Each expression is followed by the result for the row selected above:

sys_dburigen (atp.table_owner, atp.table_name, atp.partition_name, atp.partition_position, atp.high_value, 'text()').getclob()

200

---

sys_dburigen (atp.table_owner, atp.table_name, atp.partition_name, atp.partition_position, atp.high_value).getclob()

<?xml version="1.0"?><HIGH_VALUE>200</HIGH_VALUE>

--

sys_dburigen (atp.table_owner, atp.table_name, atp.partition_name, atp.partition_position, atp.high_value, 'text()')

DBURITYPE('/PUBLIC/ALL_TAB_PARTITIONS/ROW[TABLE_OWNER=''TEST_USER'' and TABLE_NAME=''PT_RANGE'' and PARTITION_NAME=''P200'' and PARTITION_POSITION=''1'']/HIGH_VALUE/text()', NULL)

--

sys_dburigen (atp.table_owner, atp.table_name, atp.partition_name, atp.partition_position, atp.high_value)

DBURITYPE('/PUBLIC/ALL_TAB_PARTITIONS/ROW[TABLE_OWNER=''TEST_USER'' and TABLE_NAME=''PT_RANGE'' and PARTITION_NAME=''P200'' and PARTITION_POSITION=''1'']/HIGH_VALUE', NULL)

Working from the bottom pair up we see that we start by generating a dburitype which defines the type of thing we want to query and the restriction we want to use while querying. The ‘text()’ option simply adds an extra detail to the dburitype.

The top pair shows us that getclob() will then return the value we have requested, either as an XML value, or as the text value described by the XML value if we’ve supplied the ‘text()’ option.

Our call to sys_dburigen() has specified an object we want to access, and 4 columns in that object that will identify a unique row in that object, and a fifth column that we want returned either as an XML value or as a text value.

Tracing

I actually worked through the analysis in the opposite direction to the one I’ve been showing. When I saw the call to sys_dburigen() I suspected that it might be doing the same thing as my PL/SQL function call, so I ran the two queries with SQL tracing enabled to see what activity took place at the database.

Ignoring the driving query against all_tab_partitions, the content of the PL/SQL trace was basically 3 executions (I had 3 partitions in the pt_range table) of:

SELECT ATP.HIGH_VALUE
FROM
 ALL_TAB_PARTITIONS ATP WHERE ATP.TABLE_OWNER = UPPER(:B4 ) AND
  ATP.TABLE_NAME = UPPER(:B3 ) AND ATP.PARTITION_NAME = UPPER(:B2 ) AND
  ATP.PARTITION_POSITION = UPPER(:B1 )

The content of the sys_dburigen() trace was 3 executions of a query like:

SELECT alias000$."HIGH_VALUE" AS HIGH_VALUE
FROM
 "ALL_TAB_PARTITIONS" alias000$ WHERE 1 = 1 AND ((((alias000$."TABLE_OWNER"=
  'TEST_USER') AND (alias000$."TABLE_NAME"='PT_RANGE')) AND
  (alias000$."PARTITION_NAME"='P200')) AND (alias000$."PARTITION_POSITION"=
  '1'))

Note particularly the literal values in the predicates in lines 4, 5 and 6. This version of the code has to generate and optimise (hard-parse) a new SQL statement for every partition in the table referenced in the driving query. For a table with a large number of partitions, and a system with a large number of partitioned tables, the disruption of shared pool that this might cause could be severe if (as the user said at one point) “we will be frequently selecting from all_tab_partitions”. [Damage limitation: if the session sets cursor_sharing to FORCE temporarily then the generated SQL will be subject to bind variable substitution; but that’s not an ideal workaround.]

Summary

Using LONG columns in SQL predicates is not nice – and not likely to be efficient – but there are ways of working around the limitations of LONGs. It’s undesirable to use PL/SQL that calls SQL inside a SQL statement, but we can use a PL/SQL function to return a string from a LONG in the current row – and since that’s pretty much what Oracle seems to be doing with its call to sys_dburigen() it’s hard to insist that the PL/SQL strategy is inappropriate. (But maybe the call to sys_dburigen() in this context would be considered an abuse of a feature anyway – even though it seems much more elegant and flexible once you’ve learned a little about how it works.)

Footnote

As another detail on analysing the cost/benefit of different approaches – it would be possible to avoid creating the pl/sql function by embedding it in the SQL as a “with function” clause:

with function get_high_value (
        i_tab_owner varchar2,
        i_tab_name varchar2, 
        i_part_name varchar2,
        i_part_posn number
)
return varchar2
is
        v1 varchar2(4000);
begin
        select  atp.high_value 
        into    v1 
        from    all_tab_partitions atp
        where   atp.table_owner         = upper(i_tab_owner)
        and     atp.table_name          = upper(i_tab_name)
        and     atp.partition_name      = upper(i_part_name)
        and     atp.partition_position  = upper(i_part_posn)
        ;
        return v1;
end;
select
        apt.table_owner, apt.table_name, 
        apt.tablespace_name, 
        apt.partition_name, apt.partition_position,
        apt.high_value
from
        all_tab_partitions      apt
where
        apt.table_owner = 'TEST_USER'
and     apt.table_name  = 'PT_RANGE'
and     get_high_value(
                apt.table_owner, 
                apt.table_name, 
                apt.partition_name,
                apt.partition_position
        ) = '200'
/

Footnote 2

I have asked the user why they want to query all_tab_partitions by high_value since it seems to be a slightly odd thing to do and there may be a better way of doing whatever it is that this query is supposed to support. They haven’t responded to the question, so I’ll take a guess that they want to rename (or move etc.) partitions that they don’t know the name for – perhaps because they are using interval partitioning or automatic list partitioning. If the guess is correct then the solutions offered are irrelevant – you don’t need to know the name of a partition to manipulate it, you need only know some value that is a legal member of the partition:

SQL> select partition_name from user_tab_partitions where table_name = 'PT_RANGE' order by partition_position;

PARTITION_NAME
----------------------
P200
P400
P600

3 rows selected.

SQL> alter table pt_range rename partition for (199) to pt_0200;

Table altered.

SQL> select partition_name from user_tab_partitions where table_name = 'PT_RANGE' order by partition_position;

PARTITION_NAME
----------------------
PT_0200
P400
P600

3 rows selected.

Missing Partition

Jonathan Lewis - Tue, 2024-03-12 07:20

Here’s a silly little detail about execution plans on (interval) partitioned tables that I hadn’t noticed until it showed up on this thread on a public Oracle forum: it’s an execution plan that claims that Oracle will be visiting a partition that clearly won’t be holding the data requested.

Here’s the starting section of a demonstration script – mostly by Solomon Yakobson with minor tweaks and additions from me:

rem
rem     Script:         non_existent_partition.sql
rem     Author:         Solomon Yakobson / Jonathan Lewis
rem     Dated:          Mar 2024
rem
rem     Last tested 
rem             19.11.0.0
rem

create table invoices_partitioned(
        invoice_no   number not null,
        invoice_date date   not null,
        comments     varchar2(500)
)
partition by range (invoice_date)
interval (interval '3' month)
(
        partition invoices_past values less than (date '2023-01-01')
);

insert into invoices_partitioned
select  level,
        date '2023-01-01' + numtoyminterval(3 * (level - 1),'month'),
        null
from  dual
connect by level <= 6
/

insert into invoices_partitioned select * from invoices_partitioned;
/
/
/
/
/
/
/
/
/
/
/
/
/
/
/
/

commit
/

execute dbms_stats.gather_table_stats(user,'invoices_partitioned')

set linesize 156
column high_value format a80

select  partition_position, num_rows,
        partition_name,
        high_value
  from  user_tab_partitions
  where table_name = 'INVOICES_PARTITIONED'
  order by partition_position
/

alter table invoices_partitioned drop partition for (date'2023-09-01');
purge recyclebin;

select  partition_position, num_rows,
        partition_name,
        high_value
  from  user_tab_partitions
  where table_name = 'INVOICES_PARTITIONED'
  order by partition_position
/

The script creates an interval partitioned table, with an interval of 3 months, then inserts 131,072 rows per partition (the strange re-execution of “insert into x select from x” was my lazy way of increasing the volume of data from the original one row per partition without having to think too carefully).

After creating the data we report the partition names and high values in order, then drop (and purge) the partition that should hold the value 1st Sept 2023 (which will be the partition with the high_value of 1st Oct 2023) and report the partition names and high values again so that you can see the “gap” in the high values and the adjustment to the partition_position values. Here are the “before” and “after” outputs:

PARTITION_POSITION   NUM_ROWS PARTITION_NAME         HIGH_VALUE
------------------ ---------- ---------------------- --------------------------------------------------------------------------------
                 1          0 INVOICES_PAST          TO_DATE(' 2023-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 2     131072 SYS_P39375             TO_DATE(' 2023-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 3     131072 SYS_P39376             TO_DATE(' 2023-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 4     131072 SYS_P39377             TO_DATE(' 2023-10-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 5     131072 SYS_P39378             TO_DATE(' 2024-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 6     131072 SYS_P39379             TO_DATE(' 2024-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 7     131072 SYS_P39380             TO_DATE(' 2024-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

7 rows selected.



PARTITION_POSITION   NUM_ROWS PARTITION_NAME         HIGH_VALUE
------------------ ---------- ---------------------- --------------------------------------------------------------------------------
                 1          0 INVOICES_PAST          TO_DATE(' 2023-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 2     131072 SYS_P39375             TO_DATE(' 2023-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 3     131072 SYS_P39376             TO_DATE(' 2023-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 4     131072 SYS_P39378             TO_DATE(' 2024-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 5     131072 SYS_P39379             TO_DATE(' 2024-04-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA
                 6     131072 SYS_P39380             TO_DATE(' 2024-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIA

6 rows selected.

Now we check the execution plan for a query that would have accessed the partition we’ve just dropped:

explain plan for
select  *
  from  invoices_partitioned
  where invoice_date = date '2023-09-01';

select  *
  from  dbms_xplan.display();


PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------
Plan hash value: 1148008570

---------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                 | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                      |   109K|  1173K|   104   (1)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|                      |   109K|  1173K|   104   (1)| 00:00:01 |     4 |     4 |
|*  2 |   TABLE ACCESS FULL    | INVOICES_PARTITIONED |   109K|  1173K|   104   (1)| 00:00:01 |     4 |     4 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("INVOICE_DATE"=TO_DATE(' 2023-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The execution plan says it’s going to visit partition number 4 (pstart/pstop) – which we know definitely cannot be holding any relevant data. If this were an ordinary range-partitioned table – as opposed to interval partitioned – it would be the correct partition for 1st Sept 2023, of course, but it isn’t, so it feels like the pstart/pstop ought to say something like “non-existent” and all the numeric estimates should be zero.

A quick trick for making an interval partition appear without inserting data into it is to issue a “lock table … partition for () …” statement (See footnote to this blog note). I did wonder if the attempt to explain a plan that needed a non-existent partition had actually had the same effect of making Oracle create the partition, so I ran the query against user_tab_partitions again just to check that this hadn’t happened.
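As an illustration of that trick (a sketch based on the table in this demo, not something that was run as part of the original test), locking the missing partition into existence and then re-checking the view would look something like this:

lock table invoices_partitioned partition for (date '2023-09-01') in share mode;

select  partition_position, partition_name, high_value
from    user_tab_partitions
where   table_name = 'INVOICES_PARTITIONED'
order by partition_position
/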

So what’s going to happen at run-time: is this an example of “explain plan” telling us a story that’s not going to match what shows up in v$sql_plan (dbms_xplan.display_cursor)? Let’s run the query (with rowsource execution stats enabled) and find out:


set serveroutput off
alter session set statistics_level = all;
alter session set "_rowsource_execution_statistics"=true;

select  *
  from  invoices_partitioned
  where invoice_date = date '2023-09-01';


select  *
  from  table(dbms_xplan.display_cursor(format=>'allstats last partition'));


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------
SQL_ID  d42kw12htubhn, child number 0
-------------------------------------
select  *   from  invoices_partitioned   where invoice_date = date
'2023-09-01'

Plan hash value: 1148008570

---------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                 | Starts | E-Rows | Pstart| Pstop | A-Rows |   A-Time   |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                      |      1 |        |       |       |      0 |00:00:00.01 |
|   1 |  PARTITION RANGE SINGLE|                      |      1 |    109K|     4 |     4 |      0 |00:00:00.01 |
|*  2 |   TABLE ACCESS FULL    | INVOICES_PARTITIONED |      0 |    109K|     4 |     4 |      0 |00:00:00.01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("INVOICE_DATE"=TO_DATE(' 2023-09-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

It’s the same plan with the same “wrong” partition identified, and the same estimate for rows returned – but the access never actually happened: Starts = 0 on the table access full.

My hypothesis about this misleading reporting is that Oracle knows from the table definition everything about every partition that might eventually exist – the high_value for the “anchor” partition is known and the interval is known so the appropriate partition number for any partition key value can be derived. Then, at some point, a disconnect appears between the theoretical partition position and the set of physically instantiated partitions, so the optimizer gets the message “theoretically it’s in the 4th partition” and collects the stats from “partition_position = 4” to do the arithmetic and produce the plan.

Fortunately there’s some metadata somewhere that means the run-time engine doesn’t try to access the wrong partition, so this little glitch doesn’t really matter for this simple query – beyond its ability to cause a little confusion.

It’s possible, though, that this behaviour leaves the optimizer with another reason for getting the arithmetic wrong and picking the wrong path if you have a number of “missing” partitions in an interval partitioned table that you’re querying with a range-based predicate that crosses several (non-existent) partitions. So treat this as a warning/clue if you recognise that pattern in any of your partitioned tables.
